Search engines - Technology

A search engine consists of the following elements: a URL server, multiple crawlers (also called bots, robots, or spiders), the parser, and the store server.

The URL server manages Internet addresses (URLs) that are not yet in the index. When individual web pages are submitted to a search engine, those submissions usually end up in the URL server as well. The URL server passes its addresses on to the crawler or crawlers.
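As a rough illustration, the URL server can be thought of as a queue of addresses that have not been seen before. The following is only a minimal sketch; the class name UrlServer and its methods are made up for this example and do not describe any particular search engine.

```python
from collections import deque


class UrlServer:
    """Hypothetical sketch: hands out URLs that are not yet in the index."""

    def __init__(self, indexed=()):
        self.known = set(indexed)   # URLs already indexed or already queued
        self.pending = deque()      # URLs waiting for a crawler

    def submit(self, url):
        """Register a URL; return False if it is already known."""
        if url in self.known:
            return False
        self.known.add(url)
        self.pending.append(url)
        return True

    def next_url(self):
        """Give the oldest waiting URL to a crawler, or None if none is left."""
        return self.pending.popleft() if self.pending else None
```

A crawler would call next_url() in a loop, and the store server would call submit() for every link it finds.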

The crawlers convert each URL into an IP address in order to connect to the respective server. To shorten the transfer time without overloading any one server, a crawler often opens several hundred connections simultaneously. If one connection runs into problems, the crawler can move on to the next connection quickly and without losing much time. With this rotation procedure, a single robot can process up to 30 pages per second. After a successful transfer, the crawler delivers the page data to the parser.
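The idea of many simultaneous connections, with failed ones simply skipped, might look roughly like this. It is a sketch using a thread pool from the Python standard library, not the way any real crawler is built; the function names resolve, fetch_page, and crawl and the default of 100 connections are assumptions for this example. The resolve function only shows the URL-to-IP step on its own, since urlopen performs that lookup internally.

```python
import socket
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.parse import urlsplit
from urllib.request import urlopen


def resolve(url):
    """Turn the host part of a URL into an IP address via DNS."""
    return socket.gethostbyname(urlsplit(url).hostname)


def fetch_page(url, timeout=5.0):
    """Download one page; connection problems surface as OSError."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def crawl(urls, fetch=fetch_page, max_connections=100):
    """Fetch many URLs concurrently and skip the ones that fail."""
    pages, failed = {}, []
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                pages[url] = future.result()
            except OSError:
                failed.append(url)  # do not wait; carry on with the rest
    return pages, failed
```

The fetch parameter is there so that the download step can be swapped out, for example for testing without a network connection.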

The parser creates a simplified form of each HTML page and forwards it to the store server. The store server's task is to extract the information contained in the pages. Links found in a page are passed to the URL server, and the text or terms it contains are added to the index if they are not yet known.
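What a "simplified form" of a page could mean is easiest to see in a small example. The sketch below uses Python's built-in html.parser module to reduce a page to two lists: the links it contains and the words of its visible text. The names SimplifyingParser and parse_page are invented for this example, and real parsers keep much more information (position, font size, and so on).

```python
import re
from html.parser import HTMLParser


class SimplifyingParser(HTMLParser):
    """Reduces an HTML page to its links and its visible words."""

    def __init__(self):
        super().__init__()
        self.links = []   # href values, destined for the URL server
        self.terms = []   # lower-cased words, destined for the index
        self._skip = 0    # nesting depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.terms.extend(re.findall(r"\w+", data.lower()))


def parse_page(html):
    """Return (links, terms) for one page of HTML text."""
    parser = SimplifyingParser()
    parser.feed(html)
    parser.close()
    return parser.links, parser.terms
```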

The index consists of the lexicon, the hit lists, and the repository. The lexicon is a collection of all terms found on the web, usually stored in the form of a hash table. Only terms that are included in the lexicon can produce search results. Each term, or word, in the lexicon contains a pointer to the corresponding hit list. The hit list for each word contains references to the relevant pages in the repository. The repository is where the pages themselves are saved. The hit lists also record the importance of each page in relation to the various terms or keywords.
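The relationship between the three parts can be sketched with ordinary Python dictionaries, which are hash tables. This is a simplified illustration only: the class name Index is made up, and using the number of occurrences as the "importance" of a page for a term is an assumption for this example, not how real engines weigh pages.

```python
from collections import Counter


class Index:
    """Hypothetical sketch of lexicon, hit lists, and repository."""

    def __init__(self):
        self.repository = {}  # page id -> stored page text
        self.lexicon = {}     # term -> hit list (page id -> weight)

    def add_page(self, page_id, text):
        """Store a page and record each of its terms in the hit lists."""
        self.repository[page_id] = text
        for term, count in Counter(text.lower().split()).items():
            hit_list = self.lexicon.setdefault(term, {})
            hit_list[page_id] = count  # assumed weight: term frequency
```

Looking up index.lexicon["engine"] then gives the hit list for that word, and each page id in it leads to the stored page in index.repository.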

To interact with a search engine, the user needs the front end, or searcher. The front end is nothing more than the visible component of a search engine, that is, the user interface. When the user submits a query, the searcher builds a results list from the lexicon and the hit lists.
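How the searcher might build a results list from the lexicon and the hit lists is shown in the sketch below. It assumes that the lexicon is a dictionary mapping each term to a hit list of page id and weight, that a page must contain every query term, and that pages are ordered by the sum of their weights. All three are assumptions made for this example; the function name search is hypothetical as well.

```python
def search(query, lexicon):
    """Return the ids of pages matching every query term, best first.

    `lexicon` maps each term to its hit list, here a dict of
    page id -> weight (an assumed layout for this sketch).
    """
    terms = query.lower().split()
    if not terms:
        return []
    # Terms missing from the lexicon have an empty hit list.
    hit_lists = [lexicon.get(term, {}) for term in terms]

    # Only pages that appear in every hit list match the whole query.
    matches = set(hit_lists[0])
    for hits in hit_lists[1:]:
        matches &= set(hits)

    def score(page):
        return sum(hits[page] for hits in hit_lists)

    # Highest summed weight first; page id breaks ties for a stable order.
    return sorted(matches, key=lambda page: (-score(page), page))
```

A query for a word that is not in the lexicon returns an empty list, which matches the point that only terms in the lexicon can produce results.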

Each search engine uses its own algorithm, the method by which the results are ranked. This algorithm is the real heart of a search engine. It consists of hundreds of criteria that decide at which position a page is listed for a particular query ...
