Textise can successfully extract text from thousands of web pages.
However, because of the enormous variety of HTML currently in existence on the Web, some of it badly-formed, we can’t guarantee that Textise will work every time.
So, if you’ve encountered a problem, please add a comment to this page and paste in the error message you received, along with any other information you think would help us solve the problem. If we can, we’ll adjust the Textise algorithms to cope with the problem page.
This is a good example of a page that makes Textise fall into an undignified heap:
“The address in error was http://c.moreover.com/exchange/?iref=ireportglobal.
If you like, you can try to navigate to the un-Textised version of that page or go to the Textise homepage.
Please report this error by copying all of the information on this page into a comment on the error reporting log. Thanks!
The exact error was:
System.Web.Services.Protocols.SoapException: Server was unable to process request. —> System.Net.WebException: The remote server returned an error: (404) Not Found. at System.Net.WebClient.DownloadDataInternal(Uri address, WebRequest& request) at System.Net.WebClient.DownloadData(Uri address) at System.Net.WebClient.DownloadData(String address) at textiseService.Service1.stripHTML(String strUrl) in C:\Users\Rock-Boy\Documents\Visual Studio 2008\Projects\textiseService\textiseService\Service1.asmx.cs:line 88 — End of inner exception stack trace —”
The problem’s caused by the fact that I reached the CNN site via a kind of affiliate link on the BBC News site. At the moment, I have absolutely no idea how to fix it!
LikeLike
arabic.cnn.com… relative path – why no domain?
The address in error was:
http://2009/world/3/1/iran.mullen/index.html
Error details:
System.Web.Services.Protocols.SoapException: Server was unable to process request. —> System.Net.WebException: Unable to connect to the remote server —> System.Net.Sockets.SocketException: A socket operation was attempted to an unreachable network 0.0.7.217:80 at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress) at System.Net.Sockets.Socket.InternalConnect(EndPoint remoteEP) at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Int32 timeout, Exception& exception) — End of inner exception stack trace — at System.Net.WebClient.DownloadDataInternal(Uri address, WebRequest& request) at System.Net.WebClient.DownloadData(Uri address) at System.Net.WebClient.DownloadData(String address) at textiseService.Service1.stripHTML(String strUrl) in C:\Users\Rock-Boy\Documents\Visual Studio 2008\Projects\textiseService\textiseService\Service1.asmx.cs:line 88 — End of inner exception stack trace —
LikeLike
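Editorial sketch for the report above: the link on the CNN page appears to be root-relative (it starts with `/2009/...`), and joining it to `http://` without the page's host turns `2009` into the host name. The `0.0.7.217` in the trace is simply the number 2009 read as an IPv4 address. The Python below is illustrative only (Textise itself is a .NET service); `resolve_link` and the `http://arabic.cnn.com/` page address are assumptions, not Textise code.

```python
import ipaddress
from urllib.parse import urljoin, urlsplit


def resolve_link(page_url: str, href: str) -> str:
    """Resolve a link found on page_url into an absolute URL (hypothetical helper)."""
    absolute = urljoin(page_url, href)
    if not urlsplit(absolute).netloc:
        # No host survived the join, so the result cannot be fetched.
        raise ValueError(f"cannot resolve {href!r} against {page_url!r}")
    return absolute


# Assumed address of the page the link was found on.
page = "http://arabic.cnn.com/"
print(resolve_link(page, "/2009/world/3/1/iran.mullen/index.html"))

# The integer 2009 written as an IPv4 address, matching the socket error in the trace.
print(ipaddress.ip_address(2009))
```

Resolving every link against the address of the page it came from, rather than against a bare scheme, would keep the domain in place for links like this one.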
Bookmarks become links back to Textise.
Pressing Enter on the Textise homepage causes an error.
LikeLike
Got this after going to the bbc football page, then clicking on myteam. Pasting this url into the browser seems to work fine though.
The address in error was:
http://www.bbc.co.uk/sport1/hi/football/teams/default.stm
Error details:
System.Web.Services.Protocols.SoapException: Server was unable to process request. —> System.Net.WebException: The remote server returned an error: (404) Not Found. at System.Net.WebClient.DownloadDataInternal(Uri address, WebRequest& request) at System.Net.WebClient.DownloadData(Uri address) at System.Net.WebClient.DownloadData(String address) at textiseService.Service1.stripHTML(String strUrl) in C:\Users\Rock-Boy\Documents\Visual Studio 2008\Projects\textiseService\textiseService\Service1.asmx.cs:line 88 — End of inner exception stack trace —
LikeLike
Not really an error, just an uncoded feature as yet…
When you click a non-HTML link in Textise, such as a PDF or Word doc, you might just want to download it. Textise seems to think that as it is not HTML, it does not compute, so sends an error. Can it be amended so that non-HTML files can pass straight through?
See example below
The address in error was:
Click to access Dec2008ages5-10.pdf
Error details:
There is an error in XML document (6, -37).
LikeLike
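Editorial sketch of the pass-through idea requested above: one common approach is to look at the response's `Content-Type` header and only convert HTML-like responses, sending everything else straight to the browser. This is an assumption about how it could be done, not a description of Textise; `should_textise` is a hypothetical name.

```python
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def should_textise(content_type: str) -> bool:
    """Return True only when the Content-Type header names an HTML document."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in HTML_TYPES


print(should_textise("text/html; charset=utf-8"))  # HTML: convert it
print(should_textise("application/pdf"))           # PDF: pass it through
```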
RSS feed:
The address in error was:
http://newsforums.bbc.co.uk/nol/rss/rssmessages.jspa?forumID=6205&lang=en_GB
Error details:
System.Web.Services.Protocols.SoapException: Server was unable to process request. —> System.ArgumentOutOfRangeException: Length cannot be less than zero. Parameter name: length at System.String.InternalSubStringWithChecks(Int32 startIndex, Int32 length, Boolean fAlwaysCopy) at System.String.Substring(Int32 startIndex, Int32 length) at textiseService.Service1.stripHTML(String strUrl, Int32 linkStyle) in C:\Users\Rock-Boy\Documents\Visual Studio 2008\Projects\textiseService\textiseService\Service1.asmx.cs:line 162 — End of inner exception stack trace —
LikeLike
The address in error was:
http://windowsteamblog.com/blogs/MainFeed.aspx
Error details:
System.Web.Services.Protocols.SoapException: Server was unable to process request. —> System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection. Parameter name: startIndex at System.Globalization.CompareInfo.IndexOf(String source, String value, Int32 startIndex, Int32 count, CompareOptions options) at System.Globalization.CompareInfo.IndexOf(String source, String value, Int32 startIndex) at System.String.IndexOf(String value, Int32 startIndex) at textiseService.Service1.stripHTML(String strUrl, Int32 linkStyle) in C:\Users\Rock-Boy\Documents\Visual Studio 2008\Projects\textiseService\textiseService\Service1.asmx.cs:line 171 — End of inner exception stack trace —
LikeLike
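Editorial sketch for the two feed reports above: both fail inside `Substring` and `IndexOf` calls made from `stripHTML`, which is consistent with searching for a marker that an RSS feed does not contain and then using the "not found" result as a position. That is a guess, since the code at lines 162 and 171 is not shown. The Python below shows the defensive pattern with a hypothetical helper, `text_between`; it is not Textise code.

```python
from typing import Optional


def text_between(source: str, start_marker: str, end_marker: str) -> Optional[str]:
    """Return the text between two markers, or None if either marker is missing."""
    start = source.find(start_marker)
    if start == -1:
        # Marker absent: stop here instead of slicing with a negative index.
        return None
    start += len(start_marker)
    end = source.find(end_marker, start)
    if end == -1:
        return None
    return source[start:end]


print(text_between("<title>News</title>", "<title>", "</title>"))
print(text_between("<rss version='2.0'></rss>", "<title>", "</title>"))
```

Checking each search result before using it turns an unhandled exception into a case the caller can report cleanly.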
BBC…
The address in error was:
http://www.bbc.co.uk/go/gmast/mob/-/mobile/
Error details:
startIndex cannot be larger than length of string. Parameter name: startIndex
LikeLike
Searching on amazon.co.uk. Looks like it’s only really this content that’s causing a problem. Something to do with “Luxury accommodation located near Elvis Presleys Graceland” – is this a double-encoded character (“”)?
————
The address in error was:
http://www.amazon.co.uk/s/ref=nb_ss_w_h_?url=search-alias=aps&field-keywords=graceland paul simon
Error details:
There is an error in XML document (1832, -538).
LikeLike
I tried the options, they worked in IE6, but not in Chrome. It seemed to have problems on the cookie aspx page, but cookies were set to allow all. IE6 was fine though and I enjoyed the lurid yellow and green super large fonts etc. Perhaps it is a firewall thing?
LikeLike
Have re-tried Chrome, but unfortunately it doesn’t work. I changed a couple of features in the options page, but it got stuck on Times New Roman at font 24 black on yellow. I’ll try and clear cookies and try again.
LikeLike
No, clearing cookies does not help.
LikeLike
The address in error was:
http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=storage&articleId=9134121&taxonomyId=19&intsrc=kc_top
Error details:
There is an error in XML document (1150, -1994).
LikeLike
Hi – nice tool. I was testing it on my own website but found a problem – once the page has been textised, the links from the page were broken.
Eg, if you textise this page:
http://www.zaonce.com/stuff/kettlechips.shtml
the links now have the wrong URL.
Looks like something to do with using the “base href” tag?
LikeLike
Great program – best for netbooks – but the Hungarian accented characters couldn’t be displayed.
Can you fix it?
Thanks,
B.
LikeLike
I think this is OK for Boletus now. I haven’t run any really methodical checks but I think the answer’s to choose a different font on the Options page. As far as I can see, Arial and Tahoma work fine.
This problem – and the technical term for this sort of accent is “diacritics”, by the way – was also pointed out by a reviewer on the Firefox Add-ons site, who gave the program 2 out of 5 stars as a result! Meh.
LikeLike
Error details:
System.Web.Services.Protocols.SoapException: Server was unable to process request. —> System.Net.WebException: The remote server returned an error: (404) Not Found. at System.Net.HttpWebRequest.GetResponse() at textiseService.Service1.stripHTML(String strUrl, Int32 linkStyle) in C:\Users\Rock-Boy\Documents\Visual Studio 2008\Projects\textiseService\textiseService\Service1.asmx.cs:line 65 — End of inner exception stack trace —
LikeLike
Johhny – could you please provide me with the address of the page you were trying to Textise. Thanks!
LikeLike
The address in error was:
http://www.peterrussell.com/LGN/JustLGo.html
Error details:
startIndex cannot be larger than length of string. Parameter name: startIndex
LikeLike
All fixed now. Problem was with an inconsistent tag on the page. Textise should have coped, of course, but it didn’t!
LikeLike
Thanks for the prompt response Ian. Cheers!
LikeLike
This is a weird one – please fix immediately!
The address in error was:
http://www.bbcshop.com/Drama Arts/Lark-Rise-To-Candleford-Series-3-DVD/invt/bbcdvd3183
Error details:
Server was unable to process request. —> The remote server returned an error: (404) Not Found.
LikeLike
Yeap, still not working, need a hand??
LikeLike
Hi Jorge – many thanks for the offer: help would be appreciated. To be honest, I’ve been too busy with work, family and music projects to give this my attention. (Sorry, everyone!) I had a horrible problem with the program itself after my hosting company migrated to new servers and that pretty much took up all my spare coding time!
Please feel free to have a look at the code and let me know what you think.
Thanks again.
LikeLike
Firefox 3.6.8 Linux 32 bit
The Textise context menu item is not there. Totally unable to use the plug-in.
LikeLike
Yes, apologies – the add-on doesn’t work with FF3.6. I will be working on this in the next week or so.
In the meantime, you can still use the program by navigating to http://www.textise.net.
LikeLike
Hello Ian,
You will be happy to hear that we have included Textise in Vinux 3.0. Vinux 3.0 is an Ubuntu-based distribution that is optimized for blind and visually impaired users. Unfortunately we are using Firefox 3.6.8 as the browser.
You may not know but the only web browser that is accessible in Linux (other than text browsers in the Console) is Firefox.
I would love to see some way of using Firefox code to drive Textise in the console – even if Firefox had to open up an X Window GNOME desktop on the computer.
The problem we are having currently is that many banks are not accessible with text browsers. I think using Firefox and Textise would greatly help.
LikeLike
Hi David,
I’ve now fixed the Firefox add-on for Textise so it works with Firefox 3.6. I’m really happy that you’re using Textise for your Ubuntu distro – hopefully this fix will solve the problems you’ve been having.
Unfortunately, I don’t think Textise will ever be usable for accessing banking sites. The way that it works means that I’d have to grab the user’s details for logging in and carrying out the various transactions (the HTTP request), which would create all kinds of problems around security and data protection. As well as being technically tricky I’d be quite worried about Textise acting as a sort of proxy for financial transactions and the associated risk of litigation if things went wrong!
LikeLike
One from me. I think an array might be blowing…
The address in error was:
http://www.famouswhy.com/Born_Where.html
Error details:
Exception of type ‘System.OutOfMemoryException’ was thrown.
LikeLike
And another one…
The address in error was:
http://features.metacritic.com/features/2010/microsoft-xbox-kinect-review-roundup/
Error details:
Server was unable to process request. —> The remote server returned an error: (403) Forbidden.
LikeLike
The address in error was:
http://yusu.org
Error description: The remote name could not be resolved: ‘yusu.org’
LikeLike
Hi,
Please try using the full URL – http://www.yusu.org. Works fine, in spite of the “Whoa! That’s an old browser you’re using there!” message you’ll see.
LikeLike
Textise is mistaken for a bot. It will not start properly; it sends an error message.
LikeLike
Hi,
Can you tell me exactly when this happens? As far as I can see, Textise is currently working properly. The “bot” problem usually occurs when trying to search using Google rather than going directly to a URL.
LikeLike
The address in error was:
http://winfuture.de/
LikeLike
Hi Werner – Apologies, looks like Textise is down. Must be a problem with my hosting company. Can’t do much until later today. Hopefully they’ll fix it before then!
LikeLike
Apologies to everyone who’s found that Textise is down today. I’ve raised a ticket with my hosting company and await their response…
LikeLike
It looks like Textise is back! Many thanks for your patience.
LikeLike
The address in error was:
http://www.avclub.com/articles/10-episodes-that-take-you-inside-the-weird-world-o,98601/?utm_medium=RSS&utm_campaign=feeds&utm_source=avclub_rss_daily
LikeLike
I’m aware that Textise is still broken. My hosting company have tried updating the IIS settings but that hasn’t helped. They continue to investigate. Once again, many apologies for this outage.
LikeLike
All fixed now!
Message from my hosting company:
Hello,
Thank you for your patience.
There was some temporary issue with application pool. We have recycled the application pool, now search link and all other link at: http://textise.net/ are loading fine.
LikeLike
The address in error was:
https://www.dropbox.com/sh/dw0bsy85qg6anz4/CWF9x2OoeB
and i bet it was because of their *&#@/?! if-top-location-not-self-location-then-who-cares-about-*your*-html script — the very thing i was hoping to be a clever fox and circumvent by means of textis-ation.
btw, absolutely and surprisingly awesome service. surprising, in context of the majority of services encountered by this end user, *not* in light of your character, which of course i had not taken into consideration, being unfamiliar.
here’s hoping —
LikeLike
Hi @femalefaust – thanks for the error report (and the kind words about Textise). I’ll have a look today.
LikeLike
Hi @femalefaust,
I’ve had a good look at this problem and talked to the folks who host Textise. It seems that, because the page is so huge and contains so many images and links, Textise times out when trying to process it. I could increase the time-out value for the program but that would have a knock-on effect on other users, so I’ve decided not to do it.
Many apologies for this – I hope you’ll still find other uses for Textise.
LikeLike
understood — and — i could finagle my end with that info.
LikeLike
Normal browser window showed the page OK.
The address in error was:
http://www.bbc.com/capital/story/20140625-battle-of-the-sexes-office-editi
Error description: The remote server returned an error: (403) Forbidden.
LikeLike
Hi Chuck,
Ah, now this is an interesting one. The page you’re trying to Textise isn’t accessible from the UK. When I try to access it from the UK I get this message:
Now, you may be thinking, “But I’m not in the UK!” Thing is, you might not be in the UK but Textise is, and that’s why it can’t get to the page.
Really sorry about this. At the moment I don’t have a solution but I will give it some thought.
LikeLike
The address in error was:
http://www.popsci.com/diy/article/2011-08/e9Wl7MmRImk29ZEQ.03
Error description: The request was aborted: The connection was closed unexpectedly.
LikeLike
Hi Christine – Looks like that page doesn’t exist, which is why Textise responded the way it did.
LikeLike
Hi! Thank you for this.
The last day or so it doesn’t seem to like Google results. Textise returns immediately with this error:
The address in error was:
http://www.google.com/search?q=fjord
Error description: The remote server returned an error: (503) Server Unavailable.
I am able to search Google directly using the same URL.
LikeLike
Hi Scott,
Yes, I noticed this myself. The problem is unfortunately with Google rather than Textise: the Google servers occasionally decide that Textise is a bot and block it. This is in spite of the ironic fact that Google is built on bots! I’ve tried contacting them in the past but they’re very elusive, you know.
So, apologies for the problem but all I can advise is that you use a different search engine with Textise for a while. I expect that, once again, the Google servers will start accepting Textise calls in the fullness of time.
Ian
LikeLike
The address in error was:
http://www.sancadilla.net/2014/09/columna-san-cadilla-norte-01-septiembre.html
LikeLike
Hi Orlando,
Sorry about that – the page contained a coding error, so I’ve adjusted Textise to fix it. Please try again.
LikeLike
also — will you ever support data URIs? is there a way if i ask nice?
LikeLike
You make a good point about data URIs – I’ll have a look!
LikeLiked by 1 person
that would be FIERCE!
LikeLike
The plug-in claimed to have successfully Textised http://gregdavisevent.com/lambo/ but actually nothing, not a single word of the original page, came through! Very strange.
LikeLike
Hi Martin,
The whole of the content on that particular page is displayed in an IFRAME, for reasons that escape me. Textise doesn’t process IFRAMEs, which is why there’s no output.
I’d understand if you thought that Textise ought to cope with IFRAMEs and I will consider making the necessary changes but it’s worth bearing in mind that the use of an IFRAME is generally regarded as bad design, especially when the whole page is contained in one!
Ian
LikeLike
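Editorial sketch for the IFRAME case above: if Textise were changed to cope, one possible first step would be to collect each frame's `src` so the framed document could be fetched and converted in its own right. The Python below uses the standard library's `html.parser`; `IframeFinder`, `find_iframe_sources` and the example addresses are hypothetical, and this is not how Textise works today.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class IframeFinder(HTMLParser):
    """Collect absolute URLs of every <iframe src=...> in a document."""

    def __init__(self, page_url: str) -> None:
        super().__init__()
        self.page_url = page_url
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(urljoin(self.page_url, src))


def find_iframe_sources(page_url: str, html: str) -> List[str]:
    finder = IframeFinder(page_url)
    finder.feed(html)
    return finder.sources


html = '<html><body><IFRAME SRC="inner.html"></IFRAME></body></html>'
print(find_iframe_sources("http://example.com/lambo/", html))
```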
Hey. Site is fantastic other than a small “error” that’s present in IE8 on my work PC. Instead of delivering an apostrophe, Textise gives me “'” such as:
The film The Matrix models Plato's Allegory of the Cave.[ 5]
Thanks for your help.
LikeLike
Hi Jooo,
Sorry for the delay in replying.
Unfortunately, the apostrophe in your comment looks fine! Could you let me have the URL of the page you’re having trouble with?
Thanks,
Ian
LikeLike
The homepage of Norway’s largest online newspaper, http://www.VG.no, is currently not textised correctly. Links to sections (or “subfolders”) of the site work fine. However, the links to the individual news articles do not work.
To experience the error, do this:
1) Visit homepage http://www.textise.net/showText.aspx?strURL=www.vg.no
2) Look for this text: [Image: TIPS VG]
3) Click the first link that occurs below the above text (below [Image: TIPS VG])
ERROR:
* Expected textized landing page: http://www.vg.no/forbruker/forbruker/selger-innsamlede-klaer-paa-internett/a/23408099/
* Actual textized landing page: http:///forbruker/forbruker/selger-innsamlede-klaer-paa-internett/a/23408099/
LikeLike
Hi Leif,
Thank you for reporting the error.
I’ll have a look!
Ian
LikeLike
Hi Leif,
It should work fine now. The problem was with a badly-formed “BASE” tag on the site. I’ve amended Textise to ignore such things!
Thank you for letting me know about the problem.
Ian
LikeLike
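Editorial sketch for the BASE tag fix described above: a reasonable rule is to honour a page's `<base href>` only when it yields a usable http(s) address, and otherwise fall back to the address of the page itself. The exact rule Textise applies is not stated in the thread, so the Python below is an assumption; `BaseFinder`, `effective_base` and the example addresses are hypothetical.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin, urlsplit


class BaseFinder(HTMLParser):
    """Remember the href of the first <base> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.base_href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "base" and self.base_href is None:
            self.base_href = dict(attrs).get("href")


def effective_base(page_url: str, html: str) -> str:
    """Return the URL that relative links should be resolved against."""
    finder = BaseFinder()
    finder.feed(html)
    if finder.base_href:
        candidate = urljoin(page_url, finder.base_href)
        parts = urlsplit(candidate)
        if parts.scheme in ("http", "https") and parts.netloc:
            return candidate
    # Missing, empty or unusable <base>: ignore it and use the page address.
    return page_url


page = "http://example.com/a/page.html"
print(effective_base(page, '<head><base href="http://cdn.example.com/root/"></head>'))
print(effective_base(page, '<head><base href=""></head>'))
```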
The address in error was:
http://www.newsweek.com/2015/04/24/inside-one-most-murderous-corporate-crimes-us-history-322665.html
Error description: The server committed a protocol violation. Section=ResponseHeader Detail=CR must be followed by LF
LikeLike
Thanks for letting me know about this. I’ll have a look. Probably something weird on the page.
LikeLike
This Newsweek site is very odd! I fixed one problem – the Newsweek server was returning an “unsafe header” – only to find that Textise was then ignoring the content. I have no idea why, at the moment, but I’ll have another look when I get a minute. I’m assuming some sort of made-up (!) HTML5 tag but not sure.
LikeLike
Hmm, the Newsweek server just isn’t returning any content when I make a web client call. I can only think they’re either cleverly blocking what might be perceived as a bot or their server’s configured wrong. The latter seems likely when you consider the unsafe header thing I mentioned previously but I can’t be sure.
LikeLike
i would like to report my own error. i am sorry if i used textise unwisely in an attempt to iframe google searches. it appears as if the site is no longer working for that page, and i think that was because i didn’t go ahead and use the short links like i should have. i took the page down …
i hope i can still have data uris for christmas — better yet, sooner — if i am really really good?
LikeLike
Hi Female Faust,
Ah, you’ve noticed my tardiness in looking at your data URI request. Many apologies, but I’ve been working on a new game, so Textise has been receiving a little less love for a few months.
It’s on the list though! Christmas? Well… maybe.
LikeLike
The address in error was:
https://www.law.cornell.edu/rules/frcp/rule_26
The error message is:
There is an error in XML document (293, 624).
LikeLike
Hi David,
I finally had time to give this a look. Unfortunately, I can’t find the exact cause of the problem. It seems like some sort of charset/encoding issue, which means the problem is with the server the page is on.
Ian
LikeLike
The address in error was:
http://www.nj.com/politics/index.ssf/2015/12/north_jersey_casino_plans_advance_despite_atlantic.html
The error message is:
The operation has timed out
This text-only page wa
LikeLike
Hi Matt,
This page seems to convert OK now. Perhaps Textise was very busy when you got the error. Try again and let me know if it’s working now.
LikeLike
It’s good now.
Thanks for your help;
keep up the good work.
LikeLiked by 1 person
The address in error was:
http://www.guelphmercury.com/news-story/6204656-guelph-police-investigating-after-discovery-of-a-body-in-beechwood-avenue-neighbourhood
The error message is:
System.Web.Services.Protocols.SoapException: Server was unable to process request. —> System.NullReferenceException: Object reference not set to an instance of an object. at textiseService.Service1.stripHTML(String strUrl, Int32 linkStyle, String guid) — End of inner exception stack trace —
LikeLike
Hi Tim,
That page is working for me. Maybe a temporary glitch. Please try again and let me know if it’s still throwing an error.
Ian (Textise)
LikeLike
The address in error was:
The error message is:
System.Web.Services.Protocols.SoapException: Server was unable to process request. —> System.Exception: Textise web service could not access page. [Error description: The request was aborted: Could not create SSL/TLS secure channel.] at textiseService.Service1.stripHTML(String strUrl, Int32 linkStyle, String guid) — End of inner exception stack trace —
LikeLike
getting this more and more (timed out)
The address in error was:
https://theintercept.com/2016/01/13/al-jazeera-america-terminates-all-tv-and-digital-operations/?comments=1#comments
The error message is:
The operation has timed out
LikeLike
Hi,
Yeah, sorry – we’ve been having some performance problems the last couple of days and we’re investigating. In the meantime, I’ve just checked and the page converts fine at the moment.
Ian
LikeLike
Hi,
Haven’t been able to reach textise.net for 2 days now; it had been working for me until then. I’m in New York.
Googling “textise.net down” led me to http://notopening.com/site/textise.net.html, which says textise.net is up. However, if you click the “Check from Worldwide” button, it shows textise.net is available in Europe (UK, Netherlands and Germany) but not in the U.S. (Pennsylvania, Utah and California, and apparently New York).
Are you aware of any reason textise.net would be unreachable from the U.S.?
LikeLike
Hi Joe,
Sorry to hear you’re having problems with Textise. We have been having some DNS issues in the last few days so this probably explains it.
I checked out notopening.com and it does look odd, I agree. However, I notice that I get similar results when I check cnn.com – according to notopening.com it’s also not contactable from the States! So maybe that tool’s not the most reliable.
Please let me know if you’re still having troubles. It might take the rest of today for the latest DNS changes to propagate.
Ian
LikeLike
Hi, Ian.
Thanks for your prompt response and for digging a bit further. I hadn’t thought to check what notopening.com would say about U.S. sites. I tried it for CNN, Yahoo and Microsoft and it gave me the same worldwide response, saying not reachable from its U.S. nodes; I could reach all three. You’re right, it’s not reliable.
Right now, Textise isn’t available here, but I am getting CloudFlare’s page, which is a change from the last two days, when I was getting a plain browser timeout.
On a positive note, I want to thank you for providing Textise. I use it a lot from my smart TV, whose web browser crashes easily, seemingly on any even moderately sophisticated site. I also use it from my PC; it’s great for cutting out the multimedia “fluff” from many sites.
LikeLike
Hi Joe,
Thanks for the kind words – they help when things are going wrong!
Could I ask a favour of you? I don’t know which browser you’re using but could you please clear your cache and try again?
Thanks.
LikeLike
Hi, Ian.
Sorry for the (very, very) delayed reply. Textise came up for me after about twice as long as you predicted and since then, it has been pretty reliable. My theory about “twice as long” is that CloudFlare’s DNS also needed updating before I could get access. I’m not very familiar with server and DNS internals, and I don’t really understand exactly how CloudFlare fits in, so I won’t be surprised to be way off.
In answer to your question, if you are referring to the TV, I don’t know if I can clear the cache. The TV is a Samsung 5000 series. The browser’s user agent is “Mozilla/5.0 (SMART-TV; X11; Linux i686) AppleWebKit/534.7 (KHTML, like Gecko) Version/5.0 Safari/534.7”. The only relevant browser setting is “Delete browser data,” which I often do after the browser crashes. I know it deletes cookies, but I don’t know what else, if anything.
On my PC, I use Firefox and I have it set to clear cache on exit. I don’t leave the browser open (or the network connected) when I’m not on the web, so the cache gets cleared quite often. I also keep the cache small (25 MB).
Also in the mix here is my Verizon router; what it caches I don’t know.
If I run into another occurrence of an extended Textise outage, I’ll try “Delete browser data” on the TV and/or clearing the cache in Firefox and let you know what happens.
LikeLike
ian:
quick note: the textise button redirected me to “http://localhost:55896/showText.aspx?strURL=http%253A//[my urlencoded url]” — i fixed it,at first with your basehref instead of localhost, but keeping the funky port, then w/o the funk. just sayin.
ps
extra cred
sunday virtual rebus, a puzzle:
add these:
1)[android on star trek next generation];
2)[second pers. sing.];
3)[conjugation of ‘to be’ correlating to 2];
4)’an [—]’ ( for an —‘, as in, organ of sight, one of two);
5)dot;
6)dot directly below that dot, part of same symbol;
7)pick one:
a) funny you should ask, we’re adding that support this week
b) huh
c) what exactly do you see as the end result?
d) maybe if your reminders were just a bit more cryptic, i would have
been entertained enough to consider it
respect & appreciation for one of the most useful sites on the web,
ff
LikeLike
This is what happens when you let a developer migrate live code without proper testing.
I am that developer.
Sorry, fixed now.
LikeLike
Oh, and the answer’s well, not to be too cryptic about it (unlike SOME people!): c
Sorry – spent months now fighting off piratical hacker types trying to bring Textise down. Today, maybe, I made a breakthrough and can get on with my life again.
Glad you’re still finding Textise useful. Let me know how you’d like to use it for data urls.
LikeLike
at long last, sort of finished, or at least, at a stopping place:
the counterpart to my above comment,
which is itself a post, and which is made out of posts, dedicated to textise:
http://faustsstudy.blogspot.com/2016/04/textise-proof-of-theoretical-conceptual.html
LikeLike
The address in error was:
http://biblegateway.com
The error message is:
System.Web.Services.Protocols.SoapException: Server was unable to process request. —> System.Exception: Textise web service could not access page. [Error description: The request was aborted: Could not create SSL/TLS secure channel.]
LikeLike
Hi Ron,
Thanks for reporting the problem. As the error message suggested, there was an issue with SSL. Should be fine now – if not, let me know.
Ian
LikeLike
The address in error was:
http://english.a222.org/سوالات-پایه-در-مصاحبه-انگلیسی1معرفی-مش/
The error message is:
System.Web.Services.Protocols.SoapException: Server was unable to process request. —> System.Exception: Textise web service could not access page. [Error description: The remote server returned an error: (503) Server Unavailable.]
LikeLike
Hi Mahdi,
A 503 error means that the web site’s server is stopping Textise getting at the content. This might be a temporary problem or more permanent if a firewall rule is explicitly blocking Textise.
The only answer is to contact the web site directly and ask them to remove the block.
Ian
LikeLike
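Editorial sketch for the status codes that recur in this log (403, 404, 503): the standard reason phrases are available from Python's `http.HTTPStatus`, and a tool can use them to give a slightly more helpful message. Treating only 503 as worth retrying is an assumption made for illustration, in line with the reply above that a 503 may be temporary; `describe_failure` is a hypothetical helper, not Textise code.

```python
from http import HTTPStatus

# Assumption: only "Service Unavailable" is likely to clear up on its own.
RETRYABLE = {HTTPStatus.SERVICE_UNAVAILABLE}


def describe_failure(status_code: int) -> str:
    """Build a short, human-readable message for a failed fetch."""
    status = HTTPStatus(status_code)  # raises ValueError for unknown codes
    if status in RETRYABLE:
        hint = "try again later, then contact the site if it persists"
    else:
        hint = "retrying is unlikely to help"
    return f"{status.value} {status.phrase}: {hint}"


for code in (403, 404, 503):
    print(describe_failure(code))
```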
successfully got https://www.textise.net/showText.aspx?strURL=https%253A//www.thesun.co.uk/news/2915352/russia-nuclear-bombs-arctic-radioactive-particles-europe/
then clicked on “respected websites” url is javascript:textise(‘https://theaviationist.com/2017/02/19/u-s-air-force-deploys-wc-135-nuclear-sniffer-aircraft-to-uk-after-spike-of-radioactive-iodine-levels-detected-in-europe/’) and got this error message:
———————-
Server Error in ‘/’ Application.
Length cannot be less than zero.
Parameter name: length
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: System.ArgumentOutOfRangeException: Length cannot be less than zero.
Parameter name: length
Source Error:
An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.
Stack Trace:
[ArgumentOutOfRangeException: Length cannot be less than zero.
Parameter name: length]
System.String.Substring(Int32 startIndex, Int32 length) +13741876
textise.showText.Page_Load(Object sender, EventArgs e) +13549
System.Web.UI.Control.OnLoad(EventArgs e) +109
System.Web.UI.Control.LoadRecursive() +68
System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +4498
Version Information: Microsoft .NET Framework Version:4.0.30319; ASP.NET Version:4.6.1087.0
———————————————-
LikeLike
Hi FF,
I thought you’d been quiet for a while! Nice to hear from you. 🙂
Thanks for reporting this. It’s a bit of a knotty one, to be honest. It did expose a problem with Textise, which I’ve now fixed, but this only means you now get a tidier error message, I’m afraid.
The target page’s server is returning garbage to Textise. I’m not sure why this is, perhaps an encoding thing. One thing I can tell you is that their server’s set up to reject calls from bots and/or Textise in particular. To prove this, you just need to try to Textise https://www.theavationist.com – you’ll get a 403 Forbidden.
Sorry I can’t help more.
Ian
LikeLike
Link:
https://www.quotev.com/story/7538364/Undertale-Songs/101
Error:
System.Web.Services.Protocols.SoapException: Server was unable to process request. —> System.ArgumentOutOfRangeException: Index and length must refer to a location within the string. Parameter name: length at System.String.Substring(Int32 startIndex, Int32 length)
LikeLike
Hi,
Thanks for reporting the error.
The page converts fine when I run Textise locally, so my conclusion is that the target page’s server is rejecting calls from textise.net.
If this is the case then the only answer is to contact the web site and ask them to take the firewall rule off.
Sorry I can’t be more helpful.
Ian
LikeLike
Not an error as such, but problem with foreign alphabet pages. For example
https://www.calcalist.co.il/articles/0,7340,L-3722164,00.html
originally in Hebrew, come out as gibberish; snippet:
[Image: ××ש×ת ××× ××××××]
××× ××××× ××ש×ת ××× ×××××× Play ×פ××קצ××ת ××ש×ר×× IT תקש×רת ×××××§ ×××× ×¡×××× ×¡×××ר
[Image: RSS] RSS ××× ×××××|×רש××ת ×× ×-RSS
And this page;
http://www.nrg.co.il/online/1/ART2/896/748.html
comes out just empty.
In many other cases, pages in Hebrew textise just fine.
Jack
LikeLike
Hi Jack,
Many thanks for reporting these errors.
Character sets can be tricky beasts so I’ll have to do some digging to see what’s going on. In the meantime, could you let me have the URL of a Hebrew page that renders OK, so I can compare the meta data?
The second error (no content at all) looks like a different problem – I’ll have a look at that too.
Ian
LikeLike
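Editorial sketch for the Hebrew report above: runs of “×” are what UTF-8 Hebrew looks like when its bytes are decoded with a single-byte Western character set, because every Hebrew letter starts with the byte 0xD7, which Latin-1 displays as “×”. Whether that is the actual cause here is not confirmed in the thread. The Python below shows one way to choose a character set from the `Content-Type` header or a `<meta>` tag before decoding; `pick_charset` is a hypothetical helper and the 2048-byte sniffing window is an arbitrary choice.

```python
import re


def pick_charset(content_type: str, body: bytes, default: str = "utf-8") -> str:
    """Choose a character set from the HTTP header, then a <meta> tag, then a default."""
    match = re.search(r"charset=([\w-]+)", content_type, re.IGNORECASE)
    if match:
        return match.group(1).lower()
    # Only look at the start of the document, where <meta> tags normally live.
    head = body[:2048].decode("ascii", errors="ignore")
    match = re.search(r"<meta[^>]+charset=[\"']?([\w-]+)", head, re.IGNORECASE)
    if match:
        return match.group(1).lower()
    return default


# Hebrew "shalom", written with escapes so the example is encoding-independent.
hebrew = "\u05e9\u05dc\u05d5\u05dd"
raw = hebrew.encode("utf-8")

print(raw.decode("latin-1"))                            # wrong charset: starts with "×"
print(raw.decode(pick_charset("text/html", raw)) == hebrew)  # default of utf-8 round-trips
```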
Sure thing. Here’s one that renders OK, though it displays the text left-justified instead of right-justified (Hebrew – and Arabic – being right-to-left alphabets): http://www.maariv.co.il/news/israel/Article-601423.
And for good measure, here’s a page in Hindi (Devanagari script, left-to-right) that renders fine: https://www.sabguru.com/markandey-katju-mai-baap-bihar-asks-nitish-kumar/.
LikeLike
The address in error was:
http://paterikiparadosi.blogspot.gr
The error message is:
System.Web.Services.Protocols.SoapException: Server was unable to process request. —> System.Exception: MAXIMUM CONTENT LENGTH EXCEEDED. In order to maintain a high quality experience for all of our users, Textise operates a fair use policy. Under this policy, there is a 999999 byte limit to the size of a page that can be processed
LikeLike
Hi,
Thanks for reporting this. The page converts OK for me so could you please try again?
If you’re still having difficulties, it might be a geographical problem, I guess: perhaps at your location more content is served, which breaks the maximum length limit.
LikeLike
Textise no longer working with IE. Links do not work, and the Textise/ Search buttons do not work
LikeLike
Hi,
Yes, this does seem to be the case. With apologies for the inconvenience, I think it’s a recent security update that’s caused the problem. I’ve made a pretty determined effort to tweak the IE11 compatibility settings on my own machine but so far it still refuses to play nicely.
Internet Explorer has pretty much gone down the legacy road now, although obviously some large organisations are still stuck with it.
Is it possible for you to use a different browser? Chrome and Firefox work well.
LikeLike
IE11 is now playing nicely again with Textise. I have no idea what changed …
LikeLike