Introducing the Blippex Firehose API, a world first!
Last week, when we released the first database dump of Blippex, mgamache on Hacker News said it would be great to have some kind of firehose API like Twitter has: a near-realtime feed of the URLs indexed by Blippex.
We thought "that's a pretty cool idea!" and now present the Blippex Firehose API!
The Blippex Firehose API streams the URL and the title of every page as Blippex indexes it, in near-realtime. We decided to stream the data with SockJS because it is one of the most widely used WebSocket libraries, it is actively maintained, and it gives you direct WebSocket access with working fallbacks if you are, for example, behind a proxy server.
Accessing the Blippex Firehose API
This is the code you need to access the Blippex Firehose API from JavaScript:
<script src="http://cdn.sockjs.org/sockjs-0.3.min.js"></script>
<script>
// Open a SockJS connection to the Blippex firehose
var sock = new SockJS('https://firehose.blippex.org');
// Handle each message pushed over the connection
sock.onmessage = function(e) {
  // Each message is a JSON string, so parse it
  var data = JSON.parse(e.data);
  // Update the link element; textContent keeps markup in a title from being rendered
  var link = document.getElementById('fh');
  link.textContent = data.title;
  link.href = data.url;
};
</script>
<a id="fh" href="#" target="_blank">If you don't see links here please update your browser</a>
As you can see, no registration is involved. And again, the Blippex Firehose API does not use any cookies, and we don’t store any IP addresses. The result of this script is shown here:
We also have a nicer, quite addictive demo accessible here for you! As far as we know, no other search engine has a firehose API like this, so this is a world first!
Data format
After you open a SockJS connection to https://firehose.blippex.org, the API sends JSON objects like this to the client:
{"url":"http://wbbc.co.uk","title":"BBC - Homepage"}
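If you want to guard your client against malformed messages, a small check on the parsed object may help. The following is only a sketch: the helper name parseFirehoseMessage is our own invention for illustration and is not part of the API, and it assumes that every message carries exactly the two string fields shown above.

```javascript
// Hypothetical helper (not part of the Blippex API): parse one raw
// firehose message and return {url, title}, or null if the message
// does not have the shape shown in the example above.
function parseFirehoseMessage(raw) {
  var data;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return null; // not valid JSON
  }
  if (data === null || typeof data !== 'object') {
    return null; // valid JSON, but not an object
  }
  if (typeof data.url !== 'string' || typeof data.title !== 'string') {
    return null; // one of the expected fields is missing
  }
  return { url: data.url, title: data.title };
}

// Try it with the sample message from this post
var page = parseFirehoseMessage('{"url":"http://wbbc.co.uk","title":"BBC - Homepage"}');
console.log(page.title + ' -> ' + page.url);
```

Inside a SockJS onmessage handler you would call parseFirehoseMessage(e.data) and simply skip the message when it returns null.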
We can’t wait to see what developers will do with this! If you have any ideas for improvements, please let us know.
Future
This is of course just a start. We want to implement filters so that you receive only the pages that contain a specified keyword. Think of it as the Twitter firehose API, but for web pages that were seen by real people!
And finally, to help improve Blippex, please tell your friends and help spread the word. The more people join, the better it gets for everyone.
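Until filters exist on the server, you could approximate them in the client. The sketch below is an assumption about how that might look, not a preview of the planned feature: makeKeywordFilter is a hypothetical helper, and because the firehose only delivers the URL and the title, it can only match the keyword against those two fields, not against the page content.

```javascript
// Hypothetical client-side filter (not part of the Blippex API).
// Returns a handler suitable for sock.onmessage that calls onMatch
// only for pages whose title or URL contains the keyword.
function makeKeywordFilter(keyword, onMatch) {
  var needle = keyword.toLowerCase();
  return function (e) {
    var data;
    try {
      data = JSON.parse(e.data);
    } catch (err) {
      return; // ignore messages that are not valid JSON
    }
    if (data === null || typeof data !== 'object') {
      return; // ignore anything that is not a JSON object
    }
    // Case-insensitive match against title and URL only
    var haystack = (String(data.title || '') + ' ' + String(data.url || '')).toLowerCase();
    if (haystack.indexOf(needle) !== -1) {
      onMatch(data);
    }
  };
}

// Simulate two incoming messages without a live connection
var handler = makeKeywordFilter('bbc', function (page) {
  console.log('match: ' + page.title + ' (' + page.url + ')');
});
handler({ data: '{"url":"http://wbbc.co.uk","title":"BBC - Homepage"}' });
handler({ data: '{"url":"http://example.com","title":"Example"}' });
```

With a real connection you would assign the handler directly, as in sock.onmessage = makeKeywordFilter('bbc', showPage), where showPage is whatever function you use to display a match.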
Thank you & keep on searching & discussing!