
converting word files to html


#1 converting word files to html

Posted by: scott | Date: 2006-08-09 15:52

Hello,
I have some Word documents (Word 2002) that I saved as RTF files and then opened with the "Import from RTF" menu item. That mostly works, but any commas or quotation marks in the Word file turn into question marks when I save the HTML file and upload it to the web. With the "HTML Page Preview" menu item it looks correct, but when I upload it and view it in IE it shows question marks. Any help would be great. Thanks



#2 Re: converting word files to html

Posted by: pspad | Date: 2006-08-09 16:42

Code page you used for file save ?
Did you set correct charset in HTML header ?



#3 Re: converting word files to html

Posted by: scott | Date: 2006-08-09 17:54

"Code page you used for file save ?" - Sorry, I don't understand what you are asking.

"Did you set correct charset in HTML header ?" - My guess is no, because I'm not familiar with how to do this. Is this done in Word or in PSPad?

thanks.



#4 Re: converting word files to html

Posted by: MrSpock | Date: 2006-08-09 20:31

Every character you send across to another computer must be encoded somehow. This takes place according to a set of rules which together form an encoding or a character set, charset for short.
In most cases, the charset used in a plain text document (like an HTML file) cannot be read off the document itself and hence must be explicitly stated in the so-called "header" section of that file.
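To make the idea concrete, here is a small Python sketch of how one character can end up as a question mark. It is only an illustration under assumptions: the sample text is made up, and I am guessing that the troublesome characters are Word's curly quotes (such as U+2019), which the thread has not confirmed.

```python
# Hypothetical sample text; U+2019 is the curly apostrophe that Word
# typically substitutes for a typed straight apostrophe.
text = "Scott\u2019s page"

# Windows-1252 has a slot for the curly apostrophe (byte 0x92).
cp1252_bytes = text.encode("cp1252")

# ISO-8859-1 has no such character; with errors="replace" Python
# writes a literal "?" in its place.
latin1_bytes = text.encode("iso-8859-1", errors="replace")

# The Windows-1252 bytes are not valid UTF-8, so a UTF-8 decoder
# substitutes the replacement character U+FFFD for byte 0x92.
misread = cp1252_bytes.decode("utf-8", errors="replace")

print(cp1252_bytes)
print(latin1_bytes)
print(misread)
```

Either path (saving in a charset that lacks the character, or reading the bytes with the wrong charset) would leave a placeholder where the quote used to be, which is why the declared charset and the charset used for saving have to agree.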

If your document is in some Western European language (English, Dutch, French ...) and your Windows Regional Settings are set to that same language, the ISO-8859-1 encoding will most likely be a good choice. That is, your document should start something like this:


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
<meta name="author" content="Scott">
<title>My interesting document</title>
</head>
<body>
Content goes here.
</body>
</html>
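
If you are not sure which charset a saved file actually declares, a rough check can be scripted. The Python sketch below is only an assumption-laden helper: `declared_charset` is a made-up name, it scans just the first 1024 bytes, and it uses a simple pattern match, not a real HTML parser.

```python
import re

def declared_charset(html_bytes):
    """Return the charset named in the document head, or None if absent."""
    # The declaration itself is plain ASCII, so a lossy ASCII decode of the
    # first 1024 bytes is enough to search for it.
    head = html_bytes[:1024].decode("ascii", errors="replace")
    match = re.search(r"charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", head, re.IGNORECASE)
    return match.group(1).lower() if match else None

# Hypothetical sample input modelled on the header shown above.
sample = b'<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">'
print(declared_charset(sample))
```

Keep in mind that a charset sent by the web server in its HTTP headers can take precedence over the one in the file, so this only tells you what the file itself claims.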

If you post a link to your HTML file here, somebody will try to help you figure out what's the best choice in your case.

Second thoughts: you could try to run HTML Tidy over your HTML file inside PSPad (Ctrl+F10) and read its output in the log window. It might give you a good clue what's going wrong.

Edited 1 time(s). Last edit at 2006-08-09 20:36 by MrSpock.



#5 Re: converting word files to html

Posted by: scott | Date: 2006-08-11 15:28

I tried changing the character set like you said, but it did not make any noticeable difference.

I also ran HTML Tidy and it reported these warnings:
Found 0 errors 4 warnings
line 12 column 1 - Warning: missing </span> before <div>
line 15 column 1 - Warning: inserting implicit <span>
line 63 column 1 - Warning: discarding unexpected </span>
line 12 column 1 - Warning: trimming empty <span>

If I fix them it does not seem to make any difference.

The link is:
agency.peninsulainsurance.com

Thanks.



#6 Re: converting word files to html

Posted by: carbonize | Date: 2006-08-11 19:38

I am curious as to why you don't use Word to make it into a web page? Then just edit it in PSPad.

--
Carbonize



#7 Re: converting word files to html

Posted by: MrSpock | Date: 2006-08-13 18:51

Yes, I think exporting to HTML from Word would be the easiest way. I've looked into your HTML file converted from RTF and it looks like there was some mix-up with the encoding when it was opened with PSPad in the first place. There is no easy way to fix this AFAIK.

BTW, if you have to edit HTML files on a regular basis and want a decent WYSIWYG editor, why not try Nvu (http://www.nvu.com/)? It's free and relatively easy to learn for people who have some experience with Word but don't want to dig into HTML too deeply.



#8 Re: converting word files to html

Posted by: scott | Date: 2006-08-14 20:59

Well, I actually started out by creating my document in Word and using the "Save as Web Page" option to create an HTML file. But I got the same results (' turning into ?), so when I found this program for editing and saw it had an option for importing RTF, I thought that might solve it. But it gave me the same results. The weird thing is that once I import it into PSPad, if I find the ', delete it and re-insert it, then save and upload, it works fine. So I could just do a mass search and replace, but I thought I probably wasn't the only one having this problem, so I figured I would throw it out there for people to look at.
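
For what it's worth, the mass search and replace could be scripted along these lines. This is just a Python sketch, assuming the characters in question are Word's typographic quotes, dashes and ellipsis; the table and the `plain_punctuation` name are my own choices for illustration, not anything built into PSPad.

```python
# Typographic characters Word commonly inserts, mapped to plain ASCII.
# The selection is an assumption; extend it for whatever the file contains.
REPLACEMENTS = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # ellipsis
}
TABLE = str.maketrans(REPLACEMENTS)

def plain_punctuation(text):
    """Replace typographic punctuation with plain ASCII equivalents."""
    return text.translate(TABLE)

print(plain_punctuation("It\u2019s \u201cfine\u201d now\u2026"))
```

Plain ASCII punctuation is encoded the same way in ISO-8859-1, Windows-1252 and UTF-8, so after a replacement like this the charset mismatch should no longer affect those characters.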

Thanks for the link to nvu, that looks pretty nice. I'll probably give it a try.



#9 Re: converting word files to html

Posted by: yearmore | Date: 2013-08-23 08:09

Word is very easy for editing documents compared with other tools. Here is one way to convert Word to HTML in C# that you can try. Even if I can't figure out your problem, you can still find a way to solve it.



#10 Re: converting word files to html

Posted by: carbonize | Date: 2013-08-23 09:14

Smells like spam to me

--
Carbonize







