Timothy Jenney Posted July 31, 2010 The most sophisticated importing feature in Accordance is available for HTML files (.htm extension). I regularly use Word, so converting .doc or .docx files to .htm is fairly easy: I just use Word 2008 and "Save as Web Page...." The only problem is that Word inserts a carriage return [CR] after each line in order to preserve the layout of the document. That's fine for a web page, but a problem in Accordance, because User Tools have resizable windows and we want the text to flow properly as the window resizes. Currently Accordance (8.4.7) [correctly] removes carriage returns. Unfortunately, it does not insert a space in their place. [I've been assured the next update to the program will do so.] In the interim, users will have to manually insert a space between the last word on each line and the first word on the next. Another option is available, though: use a text editor like TextWrangler (http://www.barebones.com/products/textwrangler/). Use the Find command (Cmd-F) and follow the instructions in "TextWrangler Setting.png" below. It will substitute spaces for all the carriage returns in your document. Make these changes, save your document, and then convert it in Accordance. It will work perfectly.
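For anyone who would rather script the carriage-return replacement than do it in a text editor, here is a minimal Python sketch of the same idea. It is not part of the original post: the function names and the file names in the usage note are hypothetical, and it assumes the exported .htm file is UTF-8 (check what your copy of Word actually writes).

```python
import re

# One or more line breaks (CR LF, lone CR, or lone LF), together with any
# spaces or tabs around them, collapse into a single space.
_LINE_BREAKS = re.compile(r"[ \t]*(?:\r\n|\r|\n)+[ \t]*")


def join_wrapped_lines(text: str) -> str:
    """Replace every run of line breaks with exactly one space."""
    return _LINE_BREAKS.sub(" ", text)


def fix_file(src: str, dst: str, encoding: str = "utf-8") -> None:
    """Read src, join its wrapped lines, and write the result to dst."""
    # newline="" hands CR and CR LF to the regex untranslated.
    with open(src, encoding=encoding, newline="") as f:
        text = f.read()
    with open(dst, "w", encoding=encoding, newline="") as f:
        f.write(join_wrapped_lines(text))
```

Usage would look like `fix_file("export.htm", "export-fixed.htm")`. Note that this flattens every line break in the file, including any inside preformatted blocks, so it is best run on a copy.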
Tom Childers Posted July 31, 2010 Thank you for addressing and explaining this problem of the dropped spaces when creating user tools. I have mentioned this on the forum before, and it seemed to me that I was the only person experiencing this problem. I resorted to searching for single spaces and replacing them with double spaces. While this added a lot of extra spaces, at least the words did not run together. Thanks for explaining very clearly what is taking place and offering a solution. As always, your work with Accordance is greatly appreciated. Tom
Timothy Jenney Posted July 31, 2010 Accordance does not currently offer direct import from .pdf documents. That's a shame, as more and more .pdf documents are becoming available. Happily, there are now a number of .pdf to .htm converters available for Mac users. I've found the following programs, but have not had a chance to evaluate them yet: deskUNPDF: http://www.docudesk.com/deskunpdf_product_home.shtml [free trial download]; SOLID PDF to Word for Mac: http://www.mac-pdf-converter.com/ [free trial download]; Recosoft (http://www.recosoft.com/), which offers a suite of products including conversion to both MS Office and Apple iWork formats. If you just need a single document converted, there are a number of web sites currently offering the service for free. [I'll let you do your own search for those.] If you've found a good converter I haven't mentioned, or had experience with any of these products, feel free to chime in with your thoughts.
Timothy Jenney Posted July 31, 2010 Thank you, Tom. I confess I am just passing this information along from David Lang and Rick Bennett, both of whom are far better at this sort of thing than I am. Reminds me of a good quote I heard during my doctoral work, "If I have seen further than others, it's because I have stood on the shoulders of giants!"
Tom Posted August 1, 2010 Here is one that converts from PDF to Word. It is free until August 8th (normally $39.95): http://www.anypdftools.com/pdf-to-word-for-mac.html I have played with it a little. Seems to do a nice job. PDF -> Word, then you can export it from there . . . Tom
Helen Brown Posted August 1, 2010 I agree, this seems to be working well, and should be a helpful tool for those who want to convert a PDF into a User Tool. Note: the free offer is really nice, but read the letter with the registration code carefully to save yourselves the hassles I had from not doing so. You register with their email address, not yours, and you right-click to paste the text into the box.
Timothy Jenney Posted August 5, 2010 This podcast is now posted. It covers importing plain text, HTML, and TLG directly into a User Tool, rounding out our three-part series on "Making a User Tool." I hope we will see many more of our users making their work available to others on the Accordance Exchange: http://www.accordancefiles1.com/exchange/
JSGilliom Posted August 6, 2010 Thank you for your Podcasts. I've really enjoyed the ones I've been able to watch. I just finished watching this episode, Making a User Tool Pt. 3. Importing. It looks like the text converted back to Yehudit got swapped directions one too many times. Is there an easy way to correct this or prevent it from happening? Thanks, John S Gilliom
Timothy Jenney Posted August 7, 2010 Thanks so much, John! It's all about the interaction between Accordance and the specific word processing software used in the original document. I'm one of the few people here at Accordance who use Word. One method I've been experimenting with is to keep the original document open in Word, then use copy and replace for particularly problematic words. As you may know, Hebrew is a particularly difficult language for most word processors. Many of the folks here at Accordance recommend Mellel (http://www.redlers.com/) for that reason.
Outis Posted August 19, 2010 Hello Dr. J (et al), Two things: 1) TextWrangler is a good piece of software. However, I upgraded to BBEdit long ago for exactly this kind of thing. Not only can you remove carriage returns and the like, but it also cleans up plain text documents and turns them into decent HTML code. 2) It has been a year or so since I used Accordance to import from HTML. The problem I ran into is that there is no support for tables and indented text (either indented block paragraphs in outline style or nested lists). Have these issues been addressed? I love the user tool feature. But until it can make use of basic tables and indentation, it remains for me a tool I can use with some documents, but not with most (because they almost always need these two features to look right). If you'd like to see what I'm talking about in detail, you can download the attached zipped file. The HTML file looks OK, but the imported user tool doesn't look right: the borders on the table don't exist and the spacing is off. Likewise, the nested bullets are flattened out. Like I said, I love the user tools. I'd just like to see them mature into a more usable feature for Accordance. Thanks. Attachment: DS2-HYMNS.zip
Helen Brown Posted August 19, 2010 We do plan to improve the importing, though we are busy with other features right now. I cannot promise that we'll include tables, though. HTML is such a messy and variable format that it is a major challenge to import it. Just look at all the attributes that it can give tables!
Outis Posted August 19, 2010 Good morning Helen, HTML is messy, and the tables are probably one of the messiest parts. I'll grant you that. However, it would be good to have the ability to make some basic tables without any frills or thrills. When your people get around to it, it would be good to start a discussion on some basic standards to adhere to, so that we know what to make the tables look like in HTML before we import them into Accordance. Thanks.
Helen Brown Posted August 19, 2010 I think that most of our users wouldn't know how to edit an HTML file to conform to basic standards. They either download them or export to HTML from another format.
Timothy Jenney Posted August 19, 2010 Outis, I do want to remind you that you can insert a link to a file on your hard drive in a user tool. That link can open a table or photo.
mikes Posted August 28, 2010 For those of us who have large numbers of documents in Mellel format, is there a known way to convert a .mellel file to .htm for user tool import?
Helen Brown Posted August 28, 2010 You could export to RTF and then open in TextEdit and save as HTML.
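For a large batch of files, opening each RTF in TextEdit by hand gets tedious. macOS ships a command-line tool, textutil, that performs the same kind of RTF-to-HTML conversion, so the second half of the route above could be scripted. The Python sketch below is only an illustration and is not from the thread: the function names and folder layout are hypothetical, it assumes the Mellel documents have already been exported to .rtf, and it has not been checked against how Accordance handles textutil's output. It writes .htm rather than .html because a later post in this thread reports that Accordance did not recognize .html files.

```python
import subprocess
from pathlib import Path


def textutil_command(rtf_path: Path) -> list:
    """Build the textutil call that converts one RTF file to HTML.

    The output is written next to the input, with a .htm extension.
    """
    out_path = rtf_path.with_suffix(".htm")
    return ["textutil", "-convert", "html", "-output", str(out_path), str(rtf_path)]


def convert_folder(folder: str) -> None:
    """Convert every .rtf file in folder. Requires macOS (textutil)."""
    for rtf_path in sorted(Path(folder).glob("*.rtf")):
        # check=True raises CalledProcessError if textutil reports a failure.
        subprocess.run(textutil_command(rtf_path), check=True)
```

Calling `convert_folder("exported-rtf")` would then leave one .htm file beside each .rtf file in that (hypothetical) folder.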
Fabian Posted May 30, 2014 Hello, I know this topic is a bit old, but I have some problems. Setup: Mac OS 10.9.3, Accordance 10.4.2.1, MS Word 2011 14.4.2. When I copy text from the Internet into MS Word, save it as .htm with UTF-8, and import it into a user tool in Accordance, I find that: 1. Some spaces between words are missing, e.g. "Jesuslovesyou" (not always, but often). 2. Umlauts, ß (the German sharp s), and other special characters such as ñ and ń are missing, e.g. "Matthus" instead of "Matthäus" (always). Another thing: I also often have problems pasting plain .txt text containing umlauts etc. into a user tool; I then have to search and replace all the ä, ö, ü, Ä, Ö, Ü, ß, etc. Does Accordance or another user have a solution for these problems? Would it be possible to import .xml, or does that have other problems? Greetings, Fabian
Joel Brown Posted May 30, 2014 Fabian, in order to have Accordance recognize the file as UTF-8, be sure to include this line in the HTML header:
Steve King Posted May 30, 2014 To solve the space problem, I open the .htm file in TextWrangler (which is a free download), then select 'Remove Line Breaks' from the 'Text' menu. After you save the file, import it into Accordance and the spaces should be OK.
Gedalya Posted May 30, 2014 Dr J, Can you provide a link for the latest podcast user tools pt 3?
Fabian Posted May 30, 2014 Hi Joel, I see this in my header when I save as .htm and then view the HTML source in MS Word: <meta http-equiv=Content-Type content="text/html; charset=utf-8"> It's not exactly the same as yours. If I change it to your tag, it looks a little better :-) The original: Test lstuwvao öäü ßßßßßß (((( )))) ≪≪ ≫≫ < y <> ñ And from the user tool: Test lstuwvao öäu ffffff (((( )))) j"j" k"k" < y <> Ò The äöü come through fine, but the rest do not. Would it be possible for Accordance to handle the tag that MS Word or LibreOffice writes (MS Word: <meta http-equiv=Content-Type content="text/html; charset=utf-8">), so that we don't always have to change the tag? Thanks for your help.
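Before editing the header by hand, it can help to confirm what a saved .htm file actually declares and whether its bytes really are UTF-8. The following Python sketch is an illustration only and is not from the thread: the function names are hypothetical, and the regular expression is a rough check, not a full HTML parser. It says nothing about which form of the tag Accordance itself expects.

```python
import re

# Finds charset=... inside a <meta ...> tag, with or without quotes.
_META_CHARSET = re.compile(
    rb"""<meta[^>]*charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""", re.IGNORECASE
)


def declared_charset(html_bytes: bytes):
    """Return the charset named in the first <meta> tag that has one, else None."""
    match = _META_CHARSET.search(html_bytes)
    return match.group(1).decode("ascii").lower() if match else None


def decodes_as_utf8(html_bytes: bytes) -> bool:
    """Report whether the raw bytes form valid UTF-8."""
    try:
        html_bytes.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

Reading the file with `open(path, "rb").read()` and passing the bytes to both functions shows whether the declaration and the actual encoding agree. A file that declares utf-8 but fails `decodes_as_utf8` was probably saved in a different encoding.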
Fabian Posted May 30, 2014 No, sorry, the problem is still there. I've seen that the file was not saved with your solution: MS Word then shows an alert, and it changed the encoding from utf-8 to unicode. Is it possible that the different language in the header is a problem?
Joel Brown Posted May 30, 2014 Fabian, do you mind uploading your HTML file so we can take a look for ourselves? Also, if you are just copying and pasting, why not copy and paste into the user tool directly?
Fabian Posted May 30, 2014 Hello Joel, 1. At the moment I can copy only a few lines into a user tool, even if the text is longer; that is a bug too (see below). 2. Copying via .docx or .txt does not come out correctly either. Look at the umlauts here (the ß, the German sharp s, also comes out as fi): A [1][1] 1. A, a, der erste Buchstabe des lateinischen Alphabets. Als Abk¸rzung: 1) = der Vorname Aulus. 2) = Antiquo (ich verwerfe den neuen Vorschlag), auf den Stimmtafeln in rˆm. Volksversammlungen. 3) = Absolvo (ich spreche frei), auf den Stimmtafeln der Richter; dah. A gen. littera salutaris bei Cic. Mil. 15. 4) vor Zahlen, Jahresbezeichnung More than this does not go in at once, and sometimes only a few characters do. 3. Copying directly loses the formatting (bold, italic, h1, links). 4. So it would be nice if it worked via MS Word as .htm. Via TextEdit or LibreOffice it does not work, because they save the file as .html and Accordance can't recognize it. Joel, I'm not permitted to upload a .htm file in the basic or advanced uploader.
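The garbled characters in this sample ("Abk¸rzung", "rˆm.", and the Ò that replaced ñ in the earlier test) are what you would get if the single-byte Latin-1 values of ü, ö, and ñ were read as Mac OS Roman. That is only a guess at the mechanism, not something confirmed in this thread, and it does not obviously account for every symptom reported (the ß results, for instance). The Python sketch below shows the suspected misreading and how to reverse it; the function names are hypothetical.

```python
def misread_as_mac_roman(text: str) -> str:
    """Simulate the assumed fault: Latin-1 byte values decoded as Mac OS Roman."""
    return text.encode("latin-1").decode("mac_roman")


def repair(garbled: str) -> str:
    """Reverse that misreading. Only meaningful if the assumption above holds."""
    return garbled.encode("mac_roman").decode("latin-1")


if __name__ == "__main__":
    # Show how each character from the thread would be garbled.
    for word in ("Abk\u00fcrzung", "r\u00f6m.", "\u00f1"):
        print(word, "->", misread_as_mac_roman(word))
```

If text pasted into a user tool shows exactly this pattern, running it through `repair` is one way to recover the intended characters without replacing each one by hand. Any character that has no Mac OS Roman equivalent will raise an error, which is itself a sign that the assumption does not fit.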
Joel Brown Posted June 12, 2014 Fabian, We've had a chance to look at your HTML file. The header used there is fine: Accordance is recognizing it as UTF-8. We've found two bugs in the import. First, it was stripping line breaks that weren't marked with a , rather than turning them into spaces. This caused your words to run together. Second, if it encountered one Unicode character it couldn't handle (ā), it would skip all of the other Unicode characters it can normally handle (like ß or Greek characters). Both of these bugs have been fixed for the next release, so you should experience a much smoother import soon.