A search engine built by the crowd, that does not suck
Google’s PageRank has revolutionized search, but over the last ten years it has forced every other search engine to look and act like Google. This has made it almost impossible to establish a better, modern search engine other than Google, granted its popularity is the major indicator for its quality. Nice for Google, not so nice for healthy competition.
During our work at archify, a private search engine across the content of your browsing history, Max and I realised that there is one core metric that could be an alternative to the PageRank. We learned that the biggest commitment a user can make to a web page and content, is the time the user spends on it. We can safely assume that a web page where users only spend a few seconds on is not as important as a page where they spend several minutes. That’s a much stronger commitment than hitting a like button can ever be. And it’s a stronger indication for quality content than links between machines. We call this new way of ranking pages, DwellRank. Dwellrank is a better way to find the right search results, based on the time users actually spend on the page.
After 2 years of valuable experience with archify we decided to build a search engine called Blippex, which is based on the DwellRank. We’ve built extensions for all major browsers (sorry IE), which sends the URL of the web page and the time they spent on it to our servers. The database is built by the input from humans, which helps create the Blippex search engine. It is also available for everyone, even if you have not installed the extension (but you should install it!). The challenge around Blippex is finding enough people who are willing to share their data. As a result, we have created many privacy safeguards. If you use the extension we will only save three data points:
- the URL
- the current time
- the time you spent on the page
It is very important to make clear that we do not save anything which could possibly identify you. It is also important to say that if you have any ideas on how to improve the security of Blippex even further, we are happy to do so. But let me get back to Blippex and why we built it.
A search engine built by the crowd, that does not suck
It is very important to make clear that we do not save anything which could possibly identify you. It is also important to say that if you have any ideas on how to improve the security of Blippex even further, we are happy to do so. But let me get back to Blippex and why we built it.
Search should be controlled by the people using it
Search should not be a blackbox. People should be in control of pretty much everything. At Blippex, people provide the data for the search engine and help make it better. They are also in control of the search results and can easily manipulate the different parameters.
Search Parameters should be transparent
Every parameter of the search should be available for the people who are using it. That said, it’s a work in progress, especially with the scoring of search terms, but it is possible to control how many days the search should go back and influence the final scoring in terms of searchterm frequency (how often the searchterm is mentioned on the page) versus DwellRank.
Search Ranking data should be available
Although we know the basic parameters of search algorithms, their complexity is hidden in a big blackbox. Blippex intends to open its search algorithm data. As a first step we will include the scoring details into our public API, which will be available for everyone. This is the same API that Blippex.org uses.
The Search database should be open
Not only should the ranking be transparent but also the data itself. Especially at Blippex, where the data comes from our users we think it is our duty to give it back to the users. Therefore we will publish a dump of our database every month.
Searching needs strong privacy
We do not think that there is anything wrong with tracking your users because it is sometimes the only way to improve your service. But the users should always be aware what is being tracked about him and how it is stored or even shared. Search engines have a special duty because they are dealing with with very sensitive data about their users and their interests. That’ why we are trying to keep the privacy level at the highest possible level.
We are releasing Blippex today and invite you to participate in making a search engine that doesn’t suck and is built by the people, for the people.
Gerald & Max PS: Why the heart in the Blippex logo? Only humans have a heart!
Warning: Looking for previous comments on this article? Please refer to this blog entry.