DDOS attack

Many apologies to anyone struggling with Textise today. Unfortunately, some idiot has been using it as a proxy for a denial of service attack on booking.com, causing server and application errors.

My hosting company have had to throttle Textise to protect other sites on their servers and, at the moment, I’m trying to persuade them to remove the throttling because I’m now blocking booking.com, which should stop the problem.

Once again, many apologies. Hopefully normal service will be resumed very soon.

Update 23:49 BST: Textise (and SEOtext) working fine now. I’ve had to block booking.com to stop the DDOS attacks. Annoying but it has worked. Sorry for any inconvenience.


SEOtext – “What the Search Engine saw”


Announcing our newest product: SEOtext, the SEO tool that’s powered by Textise. SEOtext gets right into the textual content of a web page.

SEOtext can analyse the text of a web site to give you invaluable information about keyword frequency, which is, according to our new prospective partner, “truly ground breaking” for SEO. SEOtext can also help you to detect copyright breaches by extracting the “raw” text of a page.

For more information, see the new SEOtext Page.

Textise 3.0

I’m pleased to say that I’ve now published a new version of Textise. Let’s call it version 3.0. This version sports a brand new hairstyle and an enquiring mind.

Seek And Ye Shall Find

First up, you now have the option of displaying a search box on some sites. This option is not switched on by default because, to be honest, it breaks the “text only” model, but you can enable it on the Options Page. When enabled, you’ll find a search box on selected sites, right at the top of the page. The feature doesn’t work with every site in existence containing a search form because the job of disentangling the myriad ways in which forms are submitted is too much for my brain. Instead, I’ve added search boxes to popular sites whose search functions are easily replicated. At the time of writing, these are:

addons.mozilla
about.com
amazon.com
amazon.co.uk
ask.com
baidu.com
bbc.co.uk
bing.com
cdwow.com
dictionary.cambridge.org
ebay.com
ebay.co.uk
google.co.uk
google.com
imdb.com
twitter.com
metacritic.com
reddit.com
sogou.com
taobao.com
wikipedia.org
yahoo.com
yandex.ru
youtube

I’ll be adding others as I find (and test) them. If you have suggestions for other sites, just write to me via the Contact Page.
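
For the curious, here is a minimal Python sketch of what "easily replicated" might look like: a lookup table of sites whose search boils down to a simple GET URL. This is not Textise's actual code. The URL templates and the `build_search_url` name are assumptions made purely for illustration, and real sites may use different parameters.

```python
from typing import Optional
from urllib.parse import quote_plus, urlsplit

# Hypothetical templates: each site's search is assumed to be a plain GET
# request with the query in a single parameter. Real sites may differ.
SEARCH_TEMPLATES = {
    "bing.com": "https://www.bing.com/search?q={query}",
    "google.com": "https://www.google.com/search?q={query}",
    "wikipedia.org": "https://en.wikipedia.org/w/index.php?search={query}",
}


def build_search_url(page_url: str, query: str) -> Optional[str]:
    """Return a search URL for the site hosting page_url, or None if unsupported."""
    host = (urlsplit(page_url).hostname or "").lower()
    for domain, template in SEARCH_TEMPLATES.items():
        # Match the bare domain or any subdomain of it.
        if host == domain or host.endswith("." + domain):
            return template.format(query=quote_plus(query))
    return None
```

A site whose form posts data, relies on JavaScript or needs hidden tokens would not fit a table like this, which is one plausible reason for supporting only a hand-picked list.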

Now that pages can contain search boxes, I’ve also decided to reduce the number of search engines available from the home page. These are now Bing, Google and Yahoo only.

Skip to content

Regular users of Textise will know that it skips over the nasty navigation section of a page, directly to the main content – but only sometimes! This limitation is imposed by the page itself: if Textise can find an internal bookmark and a “Skip to content”-type link that points to it, it uses it. Otherwise you’re stuck with ten screens of unpleasantness to scroll through.
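
As a rough illustration of that check, the Python sketch below looks for an in-page link whose text contains "skip" and confirms that the bookmark it points to really exists on the page. It is an assumption about how such detection could work, not Textise's own implementation; the `SkipLinkFinder` and `find_skip_target` names are hypothetical.

```python
from html.parser import HTMLParser
from typing import Optional


class SkipLinkFinder(HTMLParser):
    """Collect in-page "skip" links and the bookmarks (id/name) on the page."""

    def __init__(self) -> None:
        super().__init__()
        self.bookmarks = set()   # every id= or <a name=> value seen
        self.skip_targets = []   # fragments that "skip"-style links point to
        self._pending = None     # fragment of the <a href="#..."> being read
        self._text = []          # link text collected for that anchor

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if attr.get("id"):
            self.bookmarks.add(attr["id"])
        if tag == "a":
            if attr.get("name"):
                self.bookmarks.add(attr["name"])
            href = attr.get("href") or ""
            if href.startswith("#") and len(href) > 1:
                self._pending, self._text = href[1:], []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._pending is not None:
            if "skip" in "".join(self._text).lower():
                self.skip_targets.append(self._pending)
            self._pending = None


def find_skip_target(html: str) -> Optional[str]:
    """Return the bookmark a "skip" link points to, or None if there isn't one."""
    finder = SkipLinkFinder()
    finder.feed(html)
    finder.close()
    # The bookmark usually appears after the link, so compare at the end.
    for target in finder.skip_targets:
        if target in finder.bookmarks:
            return target
    return None
```

If `find_skip_target` returns `None`, the page offers no usable bookmark and the reader is left to scroll.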

So it’s a useful feature, although not previously optional. Now, however, you’ll find that you can disable it on the Options Page. It’s still enabled by default. One reason you might want to disable it is that the new search boxes appear at the top of the page, not at the start of the content, and you may not want to skip straight past them.

*** Stop Press ***
Another option is now available: Strip navigation. This option will remove all content up to the main content, assuming Textise can identify it. It requires you to also have “Skip to main content” selected. Doesn’t, unfortunately, work with PDF conversions. I’ll have a think about that.

PDF

I’ve wanted to add the option to convert Textise’s text only output to PDF format for a while now but just couldn’t find an easy-to-use, cost-effective (free) solution.

So I’m pleased to report that pdfcrowd has come to the rescue, offering a fantastic free service that requires the addition of a single line of HTML code. Genius. Try the “Convert this page to a PDF” option next time you convert a page to text. Very useful if you want to save your text only output to read later.

ALEXA

There’s lots of anti-Alexa emotion on the ‘Net and there are some valid doubts expressed about its accuracy/relevance/usefulness. However, when Textise’s Alexa rank tumbles (which is a good thing) like it has been recently, I’m happy to go along with it.

Today’s rankings for Textise (I’m writing this on 21/02/2014) are 355,852 globally, 16,486 in the UK. These are pretty good numbers when you consider that there are over 650 million web sites in the world, over 10 million of them in the UK (with a “.uk” address).
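
To put those ranks in proportion, here is the arithmetic as a small Python sketch. The site totals are the rounded "over 650 million" and "over 10 million" figures quoted above, so the percentages are approximate upper bounds; the `rank_as_percentage` helper is just an illustrative name.

```python
def rank_as_percentage(rank: int, total_sites: int) -> float:
    """Express a rank as a percentage of all sites (smaller is better)."""
    return rank / total_sites * 100


# Figures quoted in the post; totals are rounded-down estimates.
global_share = rank_as_percentage(355_852, 650_000_000)
uk_share = rank_as_percentage(16_486, 10_000_000)

print(f"Global: top {global_share:.3f}% of sites")
print(f"UK: top {uk_share:.3f}% of sites")
```

That works out at roughly the top 0.055% globally and the top 0.165% in the UK.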

Update 22/02/2014: Ha! The Alexa scores have now gone up! Told you it was rubbish. 🙂

Privacy

The use of cookies on web sites is becoming a big topic so I’ve finally got round to adding a Privacy Policy page where you can see how Textise uses them. In a sentence, Textise uses cookies to store your preferences and to provide anonymous analytical data (using Google Analytics). The analytical data is really useful because it allows me to see which sites are being converted to text and where users are located.

You’ll notice a pop-up nag box on the site now, asking you to accept the use of cookies. I’m afraid this also appears on the text only output but I have to make sure that all users see it and not everyone visits the home page (lots of people use the Firefox add-on or the bookmarklet and never need to go there – I know this because of the analytical data I get!).

Rest assured though, you need only click the acceptance button once and it’ll disappear from the whole site, forever (as long as you don’t delete your cookies!).

FIXES

Both the site and the web service have been spruced up a bit.

I noticed that the Mail Online site was causing Textise to crash out for no obvious reason. It’s a pretty nasty site, with long, long pages of links to the usual salacious celebrity stories and bigotry wrapped up as journalism. As you can tell, I love the Daily Mail but I didn’t let my personal feelings get in the way. It turned out that the problem was caused by the presence of unusual (and unnecessary) hexadecimal characters in the HTML of the pages, which I promptly zapped. All good now (apart from the wretched rag itself).

Some Chinese sites were also causing problems, this time by including “on” events in image tags (for example, onerror=”…javascript…”). This made an absolute mockery of my image processing code. Also now fixed.
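
One way to neutralise such attributes is to strip every "on" event handler from image tags before processing them. The Python sketch below shows the idea with regular expressions. It is not the actual Textise fix, the `strip_image_events` name is hypothetical, and it assumes no ">" character appears inside an attribute value.

```python
import re

# Matches one inline event-handler attribute such as onerror="..." or onload=x.
EVENT_ATTR = re.compile(
    r"""\s+on[a-z]+\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)
# Matches a whole <img ...> tag. Assumes no ">" inside attribute values.
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def strip_image_events(html: str) -> str:
    """Remove on* attributes from every <img> tag, leaving other markup alone."""
    return IMG_TAG.sub(lambda m: EVENT_ATTR.sub("", m.group(0)), html)
```

A proper HTML parser would be more robust than regular expressions for badly formed pages; the regex version is only meant to keep the example short.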

I’ve also modified various other bits of code, sometimes in a spirit of tidiness, other times to very slightly improve performance. Actually, performance has improved dramatically anyway since the recent move to Hosting UK but you know what that (UK) supermarket ad says. No? Oh well.

If you have any comments or suggestions for Textise, please get in touch via the Contact Page or add a comment on the Feedback & Suggestions Page.

And your host for tonight is…!

This weekend (Saturday 23/11/2013) I intend to move Textise to the new hosting on HostingUK. There’s absolutely no way this can go wrong, of course, but I thought I ought to warn you anyway…

You’ll know when Textise is running on the new servers if you see “Textise is now hosted by HostingUK” at the top of your text only output. Or if the site’s completely broken (which, if you remember, definitely can’t happen).

Textise has been really busy this week, which is great to see. And they said that we didn’t need text only pages! Hah!

Update 23/11/2013 13:36 GMT

The move to our new hosting has now completed successfully!

Please note that you may have to first remove the “textise.net/textiseOptions” cookie from your browser before you’ll be able to make changes on the Options page.

Note also that, at the moment, you’ll be sent back to the home page after making changes in Options. This is because of some changes I had to make due to a weird error being thrown during cookie processing. I’ll be fixing this later!

Update 23/11/2013 14:35 GMT

Fixed: After making changes on the Options page, you’ll now go back to the text only output you were viewing.

Tips ‘n’ Tricks #2

A few more ways to make your text only output look funky/clear/interesting…

Easy on the eye

Black Tahoma font, 18pt, on a silver background. Like Sunday morning.


Exercise book

Proper old skool, this one. It uses the “Lined paper” background texture and the MV Boli font in blue. Add smudges and doodles to taste.


Column

You can make text easier to read by restricting its width on the page: reducing eye/head movement is important in reducing eye strain. Use the “Text width” option to set a comfortable size. You can optionally add a border on the right-hand side, dotted or solid (it’s dotted in this example).


Manuscript

This is fun – you can make any web page look like something out of The Lord of the Rings or the Dark Ages. Just select “Manuscript” from the background texture drop-down and choose the MV Boli font in Maroon.


After writing this post I’ve been inspired to have a look for some more textures – and I might have a look at using Google fonts too. Keep an eye out!

Tips ‘n’ Tricks #1

Textise is a lot more configurable than people often realise. Using the Options page, you can choose how you’d like your text only output to look, from font colour to the way that links are formatted. The Options page can be accessed from the Textise home page or any “Textised” page.


Below are a few ideas and suggestions – just click the thumbnails to see bigger versions.

Vanilla

This is the way that the text only output is formatted by default. Font is Tahoma 14, links are in bold, underlined when hovered over.


Underlined links

In this mode, links on the page are shown underlined instead of bold. To achieve this, just change “Link appearance” to “Underlined”.


No links

To get that true “text” feel, all links can be shown as plain text by selecting “Plain text” in the “Link style” section. Note that, in this mode, the links are no longer clickable: they really are “text only”.


The other choices in the “Link style” section are “Textised” and “Original”. “Textised” means that links will lead to the text only version of a target page; “Original” means that links navigate to the “real” page.

Computer console

Finally for today, here’s a fun config: the full geek look, complete with Console font and lime text on a black background.


More tricks ‘n’ tips soon!

Hosting problems

Apologies to everyone who’s experienced problems with Textise today. It looks like my hosting company, WebHost4Life (yeah, ironic name), has messed up my config again. The home page and other HTML pages are fine but the underlying web service is broken, which is why you’ll get an error if you try to convert any pages.

I raised a ticket on them at about 10:00 UK time and have received one reply saying they’re looking into it, but the site is still broken.

I’ll continue chasing. Apologies again.

Update 17:27 GMT: Used Live Chat to request an update. Their first response was to ask me to empty my cache. This didn’t go down well. Finally convinced them to test it properly and they identified what seems like a permissions error, obviously introduced at their end. Threatened to move my business elsewhere but I don’t expect that to make any difference. They’re investigating.

Update 22:25 GMT: Chatted again and received sympathy but not much else. I pointed out that, at that point, my ticket had been open for 10 hours and the site is still unusable. The ticket’s been open for over 12 hours now and still nothing. I’m going to bed now in the hope they might pull their fingers out while I sleep. Many apologies again. I’ll update in the morning.

Update Sun 27/10 10:58 GMT: Latest activity on the ticket reports no problem at the hosting company’s end, in spite of the fact that I’ve now told them a billion times that I’ve made no changes since September and the site was fine on Friday. I’ve done some more digging and it looks like the page that calls the web service is getting a 403 Forbidden response. I’ve changed Textise’s error message in the hope it might reduce user frustration. Next thing – back on the chat so they can advise me to clear my cache again.

Update Sun 27/10 13:35: No news I’m afraid. As a test, I’ve set up a new directory for the web service, deployed to it, configured it as an application, waited for an hour (effect not immediate), modified Textise to use the new web service, re-deployed Textise, tested it and… still get the error. Not sure what to do next, tbh. Waiting for a response from WebHost4Life…

Update 28/10 08:13 GMT: Textise is now back. The ticket was updated at about 05:00 GMT this morning. No explanation, though, so I will be going back to them to find out what exactly happened. Many apologies, once again, for this outage. Rest assured I will now be seriously looking for more dependable hosting.