G’Day people. As the heading of this article suggests, I am going to share one of the adventures I have had in recent months. This adventure takes place in the virtual world of computers and the internet rather than the real world.
The Real problem
Every story begins with a problem. In my case, the problem is that internet access is very expensive. Just as magic comes with a price, so does the internet in the modern age. I have to download a video tutorial series from YouTube, and I prefer not to do that from home. I know a place where I could, but I cannot spend much time sitting there, visiting every video URL and downloading each video by hand. So the quest starts: how do I download the videos without spending much time there? I have two choices.
· Get someone to download them for me.
· Discover an automation that could do it for me.
It’s not that I could not go with solution number 1, but what if I have to download another set of videos some time later? Finding someone to do the job for me every time is not an option I would prefer. I am the type of person who would rather do some learning, some discovery, some trial and error, and go with solution #2.
First steps
Now, I happen to know of a tool called WebDriver. A web driver is a tool that communicates with a browser and tells it, through code, to perform the actions you want. I have done clicks, navigation, hovers, JavaScript execution and so on through it, and I truly believed that I could go one step further and do file downloads as well. I started digging.
Second thought
OK, that was the beginning of the story. The steps and happenings along the way are not always as funny or adventurous as you might be thinking. Keeping this in mind, let's go directly to the interesting parts, the findings and the problem solving, and skip the boring parts of blood-sucking labour and hopeless effort.
The technical problem
Finding videos on YouTube is not really a big problem. It is very easy to write a Selenium script that opens a browser, navigates to the YouTube site and searches for a term in the search box. Once the search results are available, the real problem begins. Let's say we have reached the video page where the video is being played.
The problems are listed below.
1. YouTube does not have a direct link to download the video.
Luckily, there are add-ons that you can install in Firefox to add a download link to the YouTube video page. You just click the link and save the file. Another problem, which I found later, is that Firefox has things called profiles. When you install Firefox for the first time (I am using Windows 7), a default profile is created. Extensions installed while using the default profile are available only to Firefox instances opened with that profile. Selenium opens Firefox with a new profile by default, so the add-ons are not available. I had three options.
· Configure Firefox to open with the default profile.
· Create a different profile and configure Firefox to use this one each time.
· Pass the add-on files to Firefox each time so that it installs them itself.
Solution number 1 was not a good idea, because it mixes normal browsing and development in the same profile. So I went for solution #2.
Steps.
· Create a new profile called selenium.
  o Press Win + R on Windows to open the Run dialog.
  o Type “firefox -ProfileManager”.
  o Create a new profile named selenium, with its directory somewhere other than the default place, like E:\FirefoxSeleniumProfiles.
  o When creating the WebDriver instance, configure it to use this profile.
    FirefoxProfile profile = new FirefoxProfile(new File("E:\\Programming\\Firefox-selenium-profile"));
    WebDriver driver = new FirefoxDriver(profile);
OK, a Firefox started with this WebDriver will use that particular profile. You can install the YouTube downloader add-on into that profile by opening Firefox with it and installing the add-on there.
That solves the problem of there being no download button on the YouTube video page. Now, when you manually click the download button and then the mp4 option, a window appears asking what to do with the file. The two options are to open it with some registered program or to save it to some location. This window is a native Windows dialog, and Selenium is unable to handle it.
So now the problem is “How do I force Firefox to save the file to disk quietly instead of asking?” This happens to be easy. While instantiating Firefox, just add some more preferences to that profile.
    FirefoxProfile profile = new FirefoxProfile(new File("E:\\Programming\\Firefox-selenium-profile"));
    // 2 = save to the custom folder named in browser.download.dir
    profile.setPreference("browser.download.folderList", 2);
    profile.setPreference("browser.download.manager.showWhenStarting", false);
    profile.setPreference("browser.download.dir", "E:\\Programming\\Downloads");
    // MIME types that are saved without showing the dialog
    profile.setPreference("browser.helperApps.neverAsk.saveToDisk", "text/csv,video/mp4");
    WebDriver driver = new FirefoxDriver(profile);
These preferences override the default behaviour of the Firefox download manager, set a default download directory, and list the MIME types of the files that should be saved quietly.
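Once the download is silent, the script has no dialog to tell it when a file has finished. One possible way around that, as a minimal sketch and not something Selenium provides, is to watch the download folder: Firefox keeps a temporary ".part" file next to a download while it is in progress. The class name DownloadWatcher and its methods are hypothetical helpers made up for this illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of Selenium): polls the download folder.
final class DownloadWatcher {

    // True when the folder holds no "*.part" files, the temporary files
    // Firefox keeps while a download is still in progress.
    static boolean noPartialFiles(Path downloadDir) throws IOException {
        try (DirectoryStream<Path> parts =
                 Files.newDirectoryStream(downloadDir, "*.part")) {
            return !parts.iterator().hasNext();
        }
    }

    // Polls until no partial files remain or the timeout passes.
    // Returns true if the folder was free of partial files in time.
    static boolean waitForDownloads(Path downloadDir, long timeoutMillis, long pollMillis)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (noPartialFiles(downloadDir)) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }
}
```

A call such as DownloadWatcher.waitForDownloads(Paths.get("E:\\Programming\\Downloads"), 600000, 1000) would fit after clicking the download link. The check also passes if the download has not started yet, so a short pause before the first call is probably wise.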
The problem is almost solved. The only logic left to write is which keywords to search for, how to find the video URLs, and how to iterate through those URLs. These are your choice. I have added some sample code that iterates over the URLs of the videos found by a search.
    import java.io.File;
    import java.util.List;

    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.WebElement;
    import org.openqa.selenium.firefox.FirefoxDriver;
    import org.openqa.selenium.firefox.FirefoxProfile;

    public class WithExtension {

        // Starts Firefox with the dedicated profile and the silent-download preferences.
        static WebDriver getExtendedFirefox() {
            FirefoxProfile profile = new FirefoxProfile(new File("E:\\Programming\\Firefox-selenium-profile"));
            profile.setPreference("browser.download.folderList", 2);
            profile.setPreference("browser.download.manager.showWhenStarting", false);
            profile.setPreference("browser.download.dir", "E:\\Programming\\Downloads");
            profile.setPreference("browser.helperApps.neverAsk.saveToDisk", "text/csv,video/mp4");
            WebDriver driver = new FirefoxDriver(profile);
            return driver;
        }

        public static void main(String[] args) {
            WebDriver driver = getExtendedFirefox();
            driver.get("http://www.youtube.com/watch?v=2V9h1aa-Cro");
            driver.navigate().to("https://www.youtube.com");
            // The trailing "\n" presses Enter and submits the search.
            driver.findElement(By.id("masthead-search-term")).sendKeys("java tutorial advanced level\n");
            longWait(5000); // crude pause while the results load

            WebElement videoSection = driver.findElement(By.id("gh-activityfeed"));
            List<WebElement> videos = videoSection.findElements(By.className("yt-lockup"));
            System.out.println("The number of videos is " + videos.size());

            // Print the title and the link of every video in the results.
            for (WebElement vcontent : videos) {
                WebElement vpanel = vcontent.findElement(By.className("yt-lockup-content"));
                WebElement vtitle = vpanel.findElement(By.className("yt-lockup-title"));
                System.out.println(vtitle.getText());
                System.out.println(vtitle.findElement(By.tagName("a")).getAttribute("href") + "\n");
            }
        }

        public static void longWait(long a) {
            try {
                Thread.sleep(a);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
            }
        }
    }
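Before iterating through the collected links to download them, it may help to drop duplicates and links that are not video pages. The following is a rough sketch using only the standard library. It assumes the links look like the watch URL used in the sample (a "v" query parameter holding the video id). VideoLinks, videoId and uniqueWatchLinks are hypothetical names chosen for this illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: cleans up the hrefs collected from the result page.
final class VideoLinks {

    // Returns the value of the "v" query parameter, or null if there is none.
    static String videoId(String href) {
        String query = URI.create(href).getQuery();
        if (query == null) {
            return null;
        }
        for (String pair : query.split("&")) {
            if (pair.startsWith("v=") && pair.length() > 2) {
                return pair.substring(2);
            }
        }
        return null;
    }

    // Keeps the first link seen for each video id, in the original order.
    // Null hrefs and links without a video id are skipped.
    static List<String> uniqueWatchLinks(List<String> hrefs) {
        Set<String> seenIds = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String href : hrefs) {
            if (href == null) {
                continue;
            }
            String id = videoId(href);
            if (id != null && seenIds.add(id)) {
                result.add(href);
            }
        }
        return result;
    }
}
```

In the sample's loop, each getAttribute("href") value could be added to a list, and the list passed through VideoLinks.uniqueWatchLinks before calling driver.get on every entry.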