Class Webcrawler.Crawler.Crawler
java.lang.Object
   |
   +----java.util.Observable
           |
           +----Webcrawler.Crawler.Crawler
- public class Crawler
- extends Observable
- implements Observer
The Crawler represents the model in the MVC concept. The model consists of
a URLTree object, which organises all the nodes (the actual model), a todo
pool and a done pool, and the Readers, which download the files from the net.
The Crawler communicates with one Controller through method calls, and with
the attached Visualizers through the Observable mechanism.
The Crawler itself uses the observer/observable system to synchronize the
todo pool, the done pool and the Readers.
For a more detailed description of how the Crawler works, see
the CrawlerDetails.HTML file.
- See Also:
- ControllerInterface, URLTree, FIFOQueue
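The observer/observable link between the Crawler and its Visualizers can be illustrated with a minimal sketch built on java.util.Observable and java.util.Observer. The classes ModelSketch and VisualizerSketch and the method nodeFinished are hypothetical stand-ins invented for this example; they are not part of the Webcrawler package, and the real notification payload may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Observable;
import java.util.Observer;

// Hypothetical stand-in for the Crawler: an Observable model that
// announces each finished node to whoever is attached.
@SuppressWarnings("deprecation") // Observable is deprecated since Java 9
class ModelSketch extends Observable {
    void nodeFinished(String url) {
        setChanged();          // without this call, notifyObservers does nothing
        notifyObservers(url);  // delivered to every attached Observer
    }
}

// Hypothetical stand-in for a Visualizer: it only records what it is told.
@SuppressWarnings("deprecation")
class VisualizerSketch implements Observer {
    final List<String> seen = new ArrayList<>();

    @Override
    public void update(Observable source, Object message) {
        seen.add((String) message);
    }
}

class ObserverWiringDemo {
    public static void main(String[] args) {
        ModelSketch model = new ModelSketch();
        VisualizerSketch view = new VisualizerSketch();
        model.addObserver(view);                   // attach the view to the model
        model.nodeFinished("http://example.com/"); // the view is notified
        System.out.println(view.seen);
    }
}
```

The model never calls the view directly; it only knows that some Observer may be attached, which is what keeps the model independent of the Visualizers.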
controller
gotbackParsers
gotbackReaders
maxThreadNum
parsers
readers
sentoutParsers
sentoutReaders
todoParsers
todoReaders
tree
Crawler(ControllerInterface)
- Creates a new Crawler, which creates the two pools and the Readers.
checkForCrawlerDone()
checkUnevenWorkload()
- Checks whether one of the two pools has more nodes waiting than the other.
getParsersCounter()
getParsingNodes()
getReadersCounter()
getReadingNodes()
getTodoParsersElements()
getTodoReadersElements()
nodeIsDone(URLNode)
notifyVisualizers(int, Object)
processParsersMessage(Observable, ParsersMessage)
processReadersMessage(Observable, ReadersMessage)
sendNodeToReaders(URLNode)
setMaxThreadNum(int)
- Called by the Controller to set the maximum number of Reader and Parser threads.
start(String)
- Called by the Controller to start the Crawler.
stop()
- Called by the Controller to stop the Crawler.
update(Observable, Object)
- Called by either the Parsers or the Readers when a node is done.
controller
private ControllerInterface controller
tree
private URLTree tree
todoReaders
private FIFOQueue todoReaders
todoParsers
private FIFOQueue todoParsers
readers
private Readers readers
parsers
private Parsers parsers
maxThreadNum
private int maxThreadNum
sentoutReaders
private int sentoutReaders
gotbackReaders
private int gotbackReaders
sentoutParsers
private int sentoutParsers
gotbackParsers
private int gotbackParsers
Crawler
public Crawler(ControllerInterface controller)
- Creates a new Crawler, which creates the two pools and the Readers.
The Crawler is an observer of the done pool. The Readers object attaches
itself to the todo pool and the done pool as an observer.
The Crawler needs a Controller to operate. The Controller tells
the Crawler where and when to start and whether a link should be loaded,
among other things.
- See Also:
- ControllerInterface
setMaxThreadNum
public void setMaxThreadNum(int maxThreadNum)
- Called by the Controller to set the maximum number of Reader and Parser threads.
start
public void start(String startURL) throws MalformedURLException
- Called by the Controller to start the Crawler.
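Since start declares MalformedURLException, a caller has to be prepared for a start URL that cannot be parsed. The sketch below shows how that exception arises from java.net.URL; the class StartUrlSketch and the method parseStartUrl are hypothetical names for this example, and it is an assumption that the Crawler parses the string in this way.

```java
import java.net.MalformedURLException;
import java.net.URL;

class StartUrlSketch {
    // Parses a start URL the way a start(String) method might; a string
    // that cannot be parsed surfaces as MalformedURLException.
    @SuppressWarnings("deprecation") // URL(String) is deprecated in recent JDKs
    static URL parseStartUrl(String startURL) throws MalformedURLException {
        return new URL(startURL);
    }

    public static void main(String[] args) {
        try {
            URL ok = parseStartUrl("http://example.com/index.html");
            System.out.println("host: " + ok.getHost());
            parseStartUrl("no-protocol-here"); // no protocol, so this throws
        } catch (MalformedURLException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```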
stop
public void stop()
- Called by the Controller to stop the Crawler.
sendNodeToReaders
private void sendNodeToReaders(URLNode un)
update
public synchronized void update(Observable observable, Object message)
- Called by either the Parsers or the Readers when a node is done.
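One plausible way for a single update method to serve both the Readers and the Parsers is to dispatch on the type of the message. The sketch below assumes this; ReadersMessageSketch, ParsersMessageSketch and the two counters are hypothetical, since the real ReadersMessage and ParsersMessage classes are not documented on this page.

```java
import java.util.Observable;
import java.util.Observer;

// Hypothetical message types, invented for this sketch.
class ReadersMessageSketch {
    final String url;
    ReadersMessageSketch(String url) { this.url = url; }
}

class ParsersMessageSketch {
    final String url;
    ParsersMessageSketch(String url) { this.url = url; }
}

@SuppressWarnings("deprecation") // Observable/Observer are deprecated since Java 9
class UpdateDispatchSketch implements Observer {
    int readersHandled = 0;
    int parsersHandled = 0;

    // synchronized, like the documented update(), so that Reader and Parser
    // threads reporting at the same time are handled one after another.
    @Override
    public synchronized void update(Observable source, Object message) {
        if (message instanceof ReadersMessageSketch) {
            processReadersMessage(source, (ReadersMessageSketch) message);
        } else if (message instanceof ParsersMessageSketch) {
            processParsersMessage(source, (ParsersMessageSketch) message);
        }
        // Any other message type is ignored in this sketch.
    }

    private void processReadersMessage(Observable readers, ReadersMessageSketch m) {
        readersHandled++; // a real handler would pass the node on to the Parsers
    }

    private void processParsersMessage(Observable parsers, ParsersMessageSketch m) {
        parsersHandled++; // a real handler would mark the node as done
    }

    public static void main(String[] args) {
        UpdateDispatchSketch crawler = new UpdateDispatchSketch();
        crawler.update(null, new ReadersMessageSketch("http://example.com/"));
        crawler.update(null, new ParsersMessageSketch("http://example.com/"));
        System.out.println(crawler.readersHandled + " " + crawler.parsersHandled);
    }
}
```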
notifyVisualizers
private synchronized void notifyVisualizers(int vmtype, Object node)
processReadersMessage
private void processReadersMessage(Observable readers, ReadersMessage m)
processParsersMessage
private void processParsersMessage(Observable parsers, ParsersMessage m)
nodeIsDone
private void nodeIsDone(URLNode n)
checkForCrawlerDone
private void checkForCrawlerDone()
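This page gives no description for checkForCrawlerDone. The sketch below shows one way the sentout and gotback counters could be used for such a check, under the assumption that the crawl is finished when both todo queues are empty and every node sent out has come back. The class DoneCheckSketch and the two size fields are hypothetical stand-ins; the actual condition used by the Crawler is not documented here.

```java
class DoneCheckSketch {
    // Counters named after the documented fields.
    int sentoutReaders, gotbackReaders, sentoutParsers, gotbackParsers;
    // Hypothetical stand-ins for the lengths of the two FIFOQueue pools.
    int todoReadersSize, todoParsersSize;

    // Assumed rule: nothing is waiting and nothing is still in flight.
    boolean crawlerDone() {
        boolean nothingQueued = todoReadersSize == 0 && todoParsersSize == 0;
        boolean nothingInFlight = sentoutReaders == gotbackReaders
                && sentoutParsers == gotbackParsers;
        return nothingQueued && nothingInFlight;
    }

    public static void main(String[] args) {
        DoneCheckSketch c = new DoneCheckSketch();
        c.sentoutReaders = 3;
        c.gotbackReaders = 2; // one Reader has not reported back yet
        System.out.println(c.crawlerDone());
        c.gotbackReaders = 3; // the last Reader reports back
        System.out.println(c.crawlerDone());
    }
}
```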
checkUnevenWorkload
private void checkUnevenWorkload()
- Checks whether one of the two pools has more nodes waiting than the other.
Whichever of the Readers and the Parsers has significantly more to do gets
more threads than the other.
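The rebalancing described above can be sketched as a small pure function. Everything specific in it is an assumption made for illustration: the class WorkloadBalanceSketch, the method rebalance, the factor of two that stands for "significantly more", and the rule of moving one thread at a time while leaving each side at least one thread.

```java
class WorkloadBalanceSketch {
    // Hypothetical threshold: a side counts as significantly busier when
    // its queue is more than twice as long as the other one.
    static final int FACTOR = 2;

    // Returns {readerThreads, parserThreads} after moving at most one thread
    // towards the busier side. The total number of threads never changes.
    static int[] rebalance(int todoReaders, int todoParsers,
                           int readerThreads, int parserThreads) {
        if (todoReaders > FACTOR * todoParsers && parserThreads > 1) {
            return new int[] { readerThreads + 1, parserThreads - 1 };
        }
        if (todoParsers > FACTOR * todoReaders && readerThreads > 1) {
            return new int[] { readerThreads - 1, parserThreads + 1 };
        }
        return new int[] { readerThreads, parserThreads }; // balanced enough
    }

    public static void main(String[] args) {
        int[] split = rebalance(10, 2, 3, 3); // the Readers have far more waiting
        System.out.println(split[0] + " reader threads, " + split[1] + " parser threads");
    }
}
```

Keeping the total constant means the limit set through setMaxThreadNum would still hold after each adjustment, whichever side receives the extra thread.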
getTodoReadersElements
public Enumeration getTodoReadersElements()
getTodoParsersElements
public Enumeration getTodoParsersElements()
getReadingNodes
public Vector getReadingNodes()
getReadersCounter
public int getReadersCounter()
getParsingNodes
public Vector getParsingNodes()
getParsersCounter
public int getParsersCounter()