Encoding MSWord smart quotes

  • paulC
    Junior Member
    • Mar 2004
    • 20

    #1

    Encoding MSWord smart quotes

    Any suggestions on how to deal with form input (PHP and MySQL) that includes MSWord smart quotes? One of the sites I run has a conference presentation submission form. Lots of the people submitting proposals write them up in Word, paste them into the form box, and submit, and the smart quotes (and em dashes, etc.) come out garbled. I've tried using a PHP function to convert the MS content to standard web-safe entities, but it doesn't seem to be working. What can I do to get around this?
  • Buddha
    Senior Member
    • Mar 2004
    • 825

    #2
    Check out: http://jon.hedley.net/convert-ms-word-to-plain-text

    Nice little JavaScript (embedded) on that page.
    "Whatcha mean I shouldn't be rude to my clients?! If you want polite then there will be a substantial fee increase." - Buddha

    • sdjl
      Senior Member
      • Mar 2004
      • 502

      #3
      I had this problem with some of the users on my blog site. They like to write in Word and then submit a post, which is fine, unless you want to use RSS, and then it goes pear-shaped.
      I made this little piece of code up to change the formatting when a post was inserted. It works well enough for me, so I can't see why it wouldn't work for you.

      PHP Code:
      // Remove MS Word formatting: swap curly quotes and the ellipsis
      // character for their plain equivalents
      $writing = str_replace("’", "'", $writing);
      $writing = str_replace("‘", "'", $writing);
      $writing = str_replace('“', '"', $writing);
      $writing = str_replace('”', '"', $writing);
      $writing = str_replace("…", "...", $writing);
      David
      -----
      Do you fear the obsolescence of the metanarrative apparatus of legitimation?

      • -Oz-
        Senior Member
        • Mar 2004
        • 545

        #4
        What David does is exactly what I do. So go with that.
        Dan Blomberg

        • sdjl
          Senior Member
          • Mar 2004
          • 502

          #5
          It's pretty efficient if you're inserting the data into a text file or database.
          I also encode £ (pound signs) into their ASCII counterpart. For some reason, RSS (or is it XML?) can't deal with them. Most odd.
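
          A minimal sketch of that pound-sign encoding, assuming "ASCII counterpart" means the numeric character reference &#163; and that the text is UTF-8. The function name encode_pound_for_feed is made up for illustration, and the "\u{...}" escape needs PHP 7 or later.

          ```php
          <?php
          // Hypothetical helper: swap the literal pound sign for a numeric
          // character reference so the feed only carries plain ASCII for it.
          // Assumes $text is UTF-8 encoded.
          function encode_pound_for_feed(string $text): string
          {
              // "\u{00A3}" is U+00A3 POUND SIGN.
              return str_replace("\u{00A3}", '&#163;', $text);
          }

          echo encode_pound_for_feed("Tickets cost \u{00A3}20"), "\n";
          ```

          A literal £ is legal in XML, so when a feed rejects it the usual cause is that the bytes do not match the encoding the feed declares. The numeric reference sidesteps that mismatch.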


          • anguz
            Member
            • Mar 2004
            • 47

            #6
            David, so RSS can't deal with ASCII? What about HTML entities? I've never used RSS before, so I'm not familiar.

            About the replace, I'd probably write that like this:

            Code:
            strtr($txt, array('’' => "'", '‘' => "'", '“' => '"', '”' => '"', '…' => '...'));
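
            Curly quotes pasted straight into source code can get mangled by an editor or a forum, so here is a sketch of the same strtr() idea using escape sequences instead of literal characters. It assumes the input is UTF-8 and PHP 7 or later. The function name is hypothetical, and the two dash entries are additions that are not part of the lists posted above.

            ```php
            <?php
            // Hypothetical helper: map Word-style punctuation to plain ASCII.
            // Assumes $txt is UTF-8 encoded.
            function straighten_word_punctuation(string $txt): string
            {
                return strtr($txt, [
                    "\u{2018}" => "'",   // left single quote
                    "\u{2019}" => "'",   // right single quote / apostrophe
                    "\u{201C}" => '"',   // left double quote
                    "\u{201D}" => '"',   // right double quote
                    "\u{2026}" => '...', // horizontal ellipsis
                    "\u{2013}" => '-',   // en dash (added for illustration)
                    "\u{2014}" => '--',  // em dash (added for illustration)
                ]);
            }

            echo straighten_word_punctuation("\u{201C}It\u{2019}s done\u{2026}\u{201D}"), "\n";
            ```

            strtr() with an array does all replacements in a single pass, so one replacement never gets re-replaced by a later one.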

            • sdjl
              Senior Member
              • Mar 2004
              • 502

              #7
              Oh no, it can handle ASCII characters, it just can't handle the literal character.
              So £ as it shows won't work, but its ASCII representation will work fine.

              • paulC
                Junior Member
                • Mar 2004
                • 20

                #8
                Thanks for the suggestions. I used a function from the PHP website in the end, to achieve something similar to what David suggested. The page's HTML encoding seems to be important, too.
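
                The function from the PHP website is not shown in the thread, so the following is only one possible way to handle the encoding side. It assumes the pasted text arrives as Windows-1252 bytes and that the page is served as UTF-8. The helper name cp1252_to_utf8 is made up for illustration.

                ```php
                <?php
                // Hypothetical helper: convert Windows-1252 form input to UTF-8.
                // Uses iconv(), which ships with PHP's standard iconv extension.
                function cp1252_to_utf8(string $raw): string
                {
                    $utf8 = iconv('Windows-1252', 'UTF-8', $raw);
                    if ($utf8 === false) {
                        // iconv() returns false when it cannot convert the input.
                        throw new InvalidArgumentException('Input is not valid Windows-1252');
                    }
                    return $utf8;
                }

                // 0x93 and 0x94 are the curly double quotes in Windows-1252.
                $pasted = "\x93Smart quotes\x94 from Word";
                $clean = cp1252_to_utf8($pasted);

                // Escape for output on a page that declares charset=UTF-8.
                echo htmlspecialchars($clean, ENT_QUOTES, 'UTF-8'), "\n";
                ```

                For this to display correctly, the page also has to declare UTF-8, for example in its Content-Type header, so the browser and the script agree on the encoding.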
