To crawl all the links in a web page, you don't need to click each one. The webBrowser1.Document object gives you several ways to get every link on the page, and you can then manipulate them as you like. For example:
You can put the following code in the event handler of a crawl button or in the Navigated event of the web browser control to get all the links (if the page must be fully loaded first, DocumentCompleted is the safer choice):
HtmlElementCollection links = webBrowser1.Document.GetElementsByTagName("a");
foreach (HtmlElement ele in links) {
    // Form3 is a form that contains a web browser control, an address combo box,
    // a GO button and a "Crawl Links" button. It also has the NavigateTo() method shown later.
    // You can do whatever you want here instead, such as opening the links in the same window.
    Form3 frmNew = new Form3();
    frmNew.Owner = this;
    frmNew.Show();
    frmNew.NavigateTo(ele.GetAttribute("href"));
}
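Opening a window for every anchor can get out of hand on pages that repeat the same link many times. Here is a minimal sketch of one way to thin the list first, assuming you copy the href values into a collection of strings before the loop. The LinkFilter class and UniqueLinks method are hypothetical names for illustration, not part of the form above:

```csharp
using System;
using System.Collections.Generic;

public static class LinkFilter
{
    // Returns each non-empty href once, keeping the original order.
    // Comparison is exact (ordinal), so URLs that differ only by case
    // are treated as different links.
    public static List<string> UniqueLinks(IEnumerable<string> hrefs)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        var result = new List<string>();
        foreach (string href in hrefs)
        {
            if (string.IsNullOrEmpty(href)) { continue; }
            string trimmed = href.Trim();
            // HashSet.Add returns false when the value is already present
            if (trimmed.Length > 0 && seen.Add(trimmed)) { result.Add(trimmed); }
        }
        return result;
    }
}
```

You could fill the input by calling ele.GetAttribute("href") for each element in the loop, then open windows only for the strings that UniqueLinks returns.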
The NavigateTo method is included here:
public void NavigateTo(string url) {
    // Only absolute URLs are accepted: new Uri(url) throws for a relative one
    Uri target;
    if (!string.IsNullOrEmpty(url) && Uri.TryCreate(url, UriKind.Absolute, out target))
        webBrowser1.Url = target;
}
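If an href turns out to be relative, or points at something a browser window cannot show (mailto: links, for instance), it helps to resolve and filter it before navigating. This is only a sketch under those assumptions. LinkResolver and TryResolve are made-up names, and the choice to allow only http and https is mine:

```csharp
using System;

public static class LinkResolver
{
    // Resolves href against the URL of the page it was found on.
    // Returns false for empty values and for schemes other than http/https.
    public static bool TryResolve(Uri pageUrl, string href, out Uri target)
    {
        target = null;
        if (string.IsNullOrEmpty(href)) { return false; }

        // This TryCreate overload combines a base URI with a relative one;
        // an href that is already absolute is kept as it is.
        Uri candidate;
        if (!Uri.TryCreate(pageUrl, href, out candidate)) { return false; }

        if (candidate.Scheme != Uri.UriSchemeHttp && candidate.Scheme != Uri.UriSchemeHttps)
        {
            return false;
        }

        target = candidate;
        return true;
    }
}
```

In the form, pageUrl would be webBrowser1.Url and href the value returned by ele.GetAttribute("href"). The resulting target can be assigned directly to the browser's Url property.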
The Navigating and Navigated events are good places to put your own code:
private void webBrowser1_Navigated(object sender, WebBrowserNavigatedEventArgs e) {
    this.Text = "Viewing: " + webBrowser1.Document.Title;
    comboBox1.Text = webBrowser1.Document.Url.ToString();
}
This updates the title of the containing form and the address combo box once the browser has navigated to a page. It is for demonstration purposes only.
The ProgressChanged event is the right place to handle progress changes. I included a progress bar in my form and initially set its Visible property to false.
It is only shown while a page is loading, and the progress is taken from the event args:
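private void webBrowser1_ProgressChanged(object sender, WebBrowserProgressChangedEventArgs e) {
    // CurrentProgress can be reported as -1 when loading finishes; skip values the bar cannot show
    if (e.CurrentProgress < 0 || e.MaximumProgress <= 0 || e.CurrentProgress > e.MaximumProgress) { return; }
    if (!progressBar1.Visible && e.CurrentProgress != e.MaximumProgress) { progressBar1.Visible = true; }
    if (progressBar1.Maximum != e.MaximumProgress) { progressBar1.Maximum = (int)e.MaximumProgress; }
    progressBar1.Value = (int)e.CurrentProgress;
}
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e) {
    // Hide the bar once the page has finished loading
    progressBar1.Visible = false;
}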
// This is the event handler for the GO button, which takes you to the specified address
private void button1_Click(object sender, EventArgs e) {
    Uri target;
    // new Uri(...) throws on empty or malformed text, so parse it defensively
    if (Uri.TryCreate(comboBox1.Text.Trim(), UriKind.Absolute, out target)) {
        progressBar1.Visible = true;
        webBrowser1.Url = target;
    }
}
The above three handlers show how you coordinate between the three events!
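The values that ProgressChanged reports are long counts, and they do not always fit a progress bar's range neatly. One alternative is to convert them to a percentage and leave the bar's Maximum at 100. A rough sketch, where ProgressMath and ToPercent are hypothetical helper names and treating out-of-range input as 0 or 100 is my own assumption:

```csharp
using System;

public static class ProgressMath
{
    // Converts a (current, maximum) pair into a whole percentage from 0 to 100.
    // Negative or zero inputs yield 0; current values past maximum yield 100.
    public static int ToPercent(long current, long maximum)
    {
        if (maximum <= 0 || current < 0) { return 0; }
        if (current >= maximum) { return 100; }
        // Floating-point math avoids overflow for very large counts
        return (int)(100.0 * current / maximum);
    }
}
```

Inside the ProgressChanged handler this would be used as progressBar1.Value = ProgressMath.ToPercent(e.CurrentProgress, e.MaximumProgress); with progressBar1.Maximum left at 100.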
Hope this code helped you answer what you have in mind!