I’m pleased to say that I’ve now published a new version of Textise. Let’s call it version 3.0. This version sports a brand new hairstyle and an enquiring mind.
Seek And Ye Shall Find
First up, you now have the option of displaying a search box on some sites. This option is not switched on by default because, to be honest, it breaks the “text only” model, but you can enable it on the Options Page. When enabled, you’ll find a search box on selected sites, right at the top of the page. The feature doesn’t work with every site in existence containing a search form because the job of disentangling the myriad ways in which forms are submitted is too much for my brain. Instead, I’ve added search boxes to popular sites whose search functions are easily replicated. As of writing, these are:
I’ll be adding others as I find (and test) them. If you have suggestions for other sites, just write to me via the Contact Page.
Now that pages can contain search boxes, I’ve also decided to reduce the number of search engines available from the home page. These are now Bing, Google and Yahoo only.
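As a rough illustration of what an "easily replicated" search function looks like, here is a minimal Python sketch. It assumes the simplest case, where a search form is a plain GET request with one query parameter. The table and the `build_search_url` helper are hypothetical and are not Textise's actual code; the three engines are simply the ones offered on the home page.

```python
from urllib.parse import urlencode

# Hypothetical lookup table: engine -> (search endpoint, query parameter name).
SEARCH_ENDPOINTS = {
    "bing": ("https://www.bing.com/search", "q"),
    "google": ("https://www.google.com/search", "q"),
    "yahoo": ("https://search.yahoo.com/search", "p"),
}


def build_search_url(engine, query):
    """Build the URL a simple GET search form would submit to."""
    base, param = SEARCH_ENDPOINTS[engine]
    # urlencode escapes the query (spaces become '+', and so on).
    return base + "?" + urlencode({param: query})


if __name__ == "__main__":
    print(build_search_url("yahoo", "text only"))
```

Sites that submit their forms via POST, scripts or hidden tokens don't fit a table like this, which is roughly why not every site can get a search box.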
Skip to content
Regular users of Textise will know that it skips over the nasty navigation section of a page, directly to the main content – but only sometimes! This limitation is imposed by the page itself: if Textise can find an internal bookmark and a “Skip to content”-type link that points to it, it jumps straight to that bookmark. Otherwise you’re stuck with ten screens of unpleasantness to scroll through.
So it’s a useful feature, although not previously optional. Now, however, you’ll find that you can disable it on the Options Page. It’s still enabled by default. One reason you might want to disable it is that the new search boxes appear at the top of the page, not at the start of the content, and you may not want to skip straight past them.
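For the curious, the detection described in this section might look something like the Python sketch below. It is only an illustration under stated assumptions, not Textise's real code: the `SkipLinkFinder` class, the `find_skip_target` function and the link-text pattern are all made up for the example.

```python
import re
from html.parser import HTMLParser

# Assumed wording of a "Skip to content"-type link; real sites vary.
SKIP_TEXT = re.compile(
    r"skip\s+(to\s+)?(main\s+)?(content|navigation)", re.IGNORECASE
)


class SkipLinkFinder(HTMLParser):
    """Collects in-page links and the bookmarks they could point to."""

    def __init__(self):
        super().__init__()
        self.anchors = set()     # ids (and <a name="...">) seen in the page
        self.candidates = []     # (fragment, link text) for href="#..." links
        self._fragment = None    # fragment of the link currently open
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.anchors.add(attrs["id"])
        if tag == "a" and attrs.get("name"):
            self.anchors.add(attrs["name"])
        href = attrs.get("href") or ""
        if tag == "a" and href.startswith("#") and len(href) > 1:
            self._fragment = href[1:]
            self._text = []

    def handle_data(self, data):
        if self._fragment is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._fragment is not None:
            text = "".join(self._text).strip()
            self.candidates.append((self._fragment, text))
            self._fragment = None


def find_skip_target(html):
    """Return the bookmark name a skip link points to, or None."""
    finder = SkipLinkFinder()
    finder.feed(html)
    finder.close()
    for fragment, text in finder.candidates:
        # Both halves are needed: a skip-style link AND an existing bookmark.
        if SKIP_TEXT.search(text) and fragment in finder.anchors:
            return fragment
    return None
```

If either half is missing (no skip link, or a link whose bookmark doesn't exist), the function returns `None`, which corresponds to the "ten screens of unpleasantness" case.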
*** Stop Press ***
Another option is now available: Strip navigation. This option will remove all content up to the main content, assuming Textise can identify it.
It requires you to also have “Skip to main content” selected. Unfortunately, it doesn’t work with PDF conversions. I’ll have a think about that.
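A minimal Python sketch of the idea, assuming the name of the main-content bookmark is already known (which is why the option depends on the skip feature). The `strip_navigation` function and the `main` bookmark name are hypothetical, and a regular expression is a crude stand-in for proper HTML parsing.

```python
import re


def strip_navigation(html, fragment):
    """Drop everything before the tag that carries the given bookmark."""
    # Match an opening tag with id="fragment" or name="fragment".
    pattern = re.compile(
        r"<[a-zA-Z][^>]*\s(?:id|name)\s*=\s*([\"'])"
        + re.escape(fragment)
        + r"\1[^>]*>"
    )
    match = pattern.search(html)
    if match is None:
        # Main content not identified: leave the page untouched.
        return html
    return html[match.start():]


if __name__ == "__main__":
    page = (
        '<html><body><a href="#main">Skip</a><ul><li>Nav</li></ul>'
        '<div id="main"><p>Story</p></div></body></html>'
    )
    print(strip_navigation(page, "main"))
```

Note that the result is no longer a complete HTML document, which is fine if the next step is conversion to plain text but would need more care otherwise.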
I’ve wanted to add the option to convert Textise’s text only output to PDF format for a while now but just couldn’t find an easy-to-use, cost-effective (free) solution.
So I’m pleased to report that pdfcrowd has come to the rescue, offering a fantastic free service that requires the addition of a single line of HTML code. Genius. Try the “Convert this page to a PDF” option next time you convert a page to text. Very useful if you want to save your text only output to read later.
There’s lots of anti-Alexa emotion on the ‘Net, and some valid doubts have been expressed about its accuracy/relevance/usefulness. However, when Textise’s Alexa rank tumbles (which is a good thing), as it has recently, I’m happy to go along with it.
Today’s rankings for Textise (I’m writing this on 21/02/2014) are 355,852 globally, 16,486 in the UK. These are pretty good numbers when you consider that there are over 650 million web sites in the world, over 10 million of them in the UK (with a “.uk” address).
Update 22/02/2014: Ha! The Alexa scores have now gone up! Told you it was rubbish. 🙂
I’m afraid this also appears on the text only output but I have to make sure that all users see it and not everyone visits the home page (lots of people use the Firefox add-on or the bookmarklet and never need to go there – I know this because of the analytical data I get!).
Rest assured though, you need only click the acceptance button once and it’ll disappear from the whole site, forever (as long as you don’t delete your cookies!).
Both the site and the web service have been spruced up a bit.
I noticed that the Mail Online site was causing Textise to crash out for no obvious reason. It’s a pretty nasty site, with long, long pages of links to the usual salacious celebrity stories and bigotry wrapped up as journalism. As you can tell, I love the Daily Mail but I didn’t let my personal feelings get in the way. It turned out that the problem was caused by the presence of unusual (and unnecessary) hexadecimal characters in the HTML of the pages, which I promptly zapped. All good now (apart from the wretched rag itself).
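The kind of clean-up involved can be sketched in a few lines of Python. This is a guess at the general approach rather than the actual fix: it assumes the offending characters were non-printing control characters, and the `zap_control_chars` function is hypothetical.

```python
import re

# Assumption: "unusual" means control characters. Tab, line feed and
# carriage return are kept because they are legitimate in HTML source.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\x9f]")


def zap_control_chars(html):
    """Remove control characters that can trip up an HTML parser."""
    return _CONTROL_CHARS.sub("", html)


if __name__ == "__main__":
    dirty = "<p>Hi\x00 there\x1f</p>\n"
    print(repr(zap_control_chars(dirty)))
```

Running a filter like this before parsing means a stray byte in one page can't bring the whole conversion down.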
I’ve also modified various other bits of code, sometimes in a spirit of tidiness, other times to very slightly improve performance. Actually, performance has improved dramatically anyway since the recent move to Hosting UK but you know what that (UK) supermarket ad says. No? Oh well.