I have some users who post to a group blog by cutting and pasting, but their pastes include things like:
<!â /* Font Definitions */ @font-face {font-family:âCambria Mathâ; panose-1:2 4 5 3 5 4 6 3 2 4; mso-font-charset:1; mso-generic-font-family:roman; mso-font-format:other; mso-font-pitch:variable; mso-font-signature:0 0 0 0 0 0;} @font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2 4; mso-font-charset:0; mso-generic-font-family:swiss; mso-font-pitch:variable; mso-font-signature:-520092929 1073786111 9 0 415 0;} @font-face {font-family:âTrebuchet MSâ; panose-1:2 11 6 3 2 2 2 2 2 4; mso-font-charset:0; mso-generic-font-family:swiss; mso-font-pitch:variable; mso-font-signature:647 0 0 0 159 0;} /* Style Definitions */ p.MsoNormal, li.MsoNormal, div.MsoNormal {mso-style-unhide:no; mso-style-qformat:yes; mso-style-parent:â"; margin-top:0in; margin-right:0in; margin-bottom:10.0pt; margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:12.0pt; font-family:âTrebuchet MSâ,âsans-serifâ; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-bidi-font-family:âTimes New Romanâ; mso-bidi-theme-font:minor-bidi; color:black;} p {mso-style-noshow:yes; mso-style-priority:99; mso-margin-top-alt:auto; margin-right:0in; mso-margin-bottom-alt:auto; margin-left:0in; mso-pagination:widow-orphan; font-size:12.0pt; font-family:âTimes New Romanâ,âserifâ; mso-fareast-font-family:âTimes New Romanâ;} .MsoChpDefault {mso-style-type:export-only; mso-default-props:yes; font-size:12.0pt; mso-ansi-font-size:12.0pt; mso-bidi-font-size:12.0pt; mso-ascii-font-family:âTrebuchet MSâ; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-hansi-font-family:âTrebuchet MSâ; mso-bidi-font-family:âTimes New Romanâ; mso-bidi-theme-font:minor-bidi; color:black;} .MsoPapDefault {mso-style-type:export-only; margin-bottom:10.0pt; line-height:115%;} @page WordSection1 {size:8.5in 11.0in; margin:1.0in 1.0in 1.0in 1.0in; mso-header-margin:.5in; mso-footer-margin:.5in; mso-paper-source:0;} div.WordSection1 {page:WordSection1;} â>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:âTable Normalâ;
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-qformat:yes;
mso-style-parent:â";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin-top:0in;
mso-para-margin-right:0in;
mso-para-margin-bottom:10.0pt;
mso-para-margin-left:0in;
line-height:115%;
mso-pagination:widow-orphan;
font-size:11.0pt;
font-family:âCalibriâ,âsans-serifâ;
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-fareast-font-family:âTimes New Romanâ;
mso-fareast-theme-font:minor-fareast;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;
mso-bidi-font-family:âTimes New Romanâ;
mso-bidi-theme-font:minor-bidi;}
What can I do to filter out code like this automatically?
There is a button on the visual text editor built into WordPress that will strip Microsoft Word formatting. It is labeled “Paste From Word”.
I’d suggest using Andrew Ozz’s TinyMCE Advanced plugin. It lets you add a ‘Paste from Word’ option that takes care of all of that for you.
However, if you’re not interested in that, you have a few more options. Like this:
Just keep adding undesirable declarations to the first capturing set in that regex to add lines that should be removed. E.g.:
(mso|panose|other-junk|annoyance)
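The kind of filter described above can be sketched roughly as follows. This is only an illustration in Python, not the snippet from the original answer: the function name `strip_word_declarations` and the exact pattern are assumptions, and in WordPress the same regex idea would have to be ported to a PHP filter on the post content. The first capturing group holds the property-name prefixes to drop, so extending it is how you remove more junk.

```python
import re

# Assumed pattern (not the original answer's regex). The first capturing
# group lists the property-name prefixes whose declarations get removed.
# The lookbehind stops it matching in the middle of another identifier.
JUNK_DECLARATION = re.compile(
    r"(?<![\w-])(mso|panose)[\w-]*\s*:[^;}]*;?\s*",
    re.IGNORECASE,
)


def strip_word_declarations(css: str) -> str:
    """Remove declarations such as `mso-pagination:widow-orphan;`."""
    return JUNK_DECLARATION.sub("", css)


if __name__ == "__main__":
    pasted = "{mso-style-noshow:yes; line-height:115%; panose-1:2 4 5 3; font-size:11.0pt;}"
    print(strip_word_declarations(pasted))
```

Note that this only removes individual declarations; selectors such as `p.MsoNormal` and the surrounding braces are left alone, so it is a partial cleanup at best.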
I’ve worked with clients who face this problem a lot. The trick, I’ve found, is to copy-paste into HTML view and then switch back to the Visual editor to tweak formatting if necessary.
This is also necessary when copy-pasting from another website. You can accidentally pull in class definitions and inline styling from the external source, which can break the display if your site doesn’t define or support those same classes and styles.
Another option would be to introduce your users to Windows Live Writer. It’s a free Microsoft product that plays nicely with copy-paste from Word and can talk to WordPress: you write your post, edit it, use the built-in spell checker, format it to display exactly how you want, then click “Publish” to push it to WordPress via XML-RPC. It’s a fairly sound system that makes it incredibly easy to teach a first-time blogger how to blog … particularly because the UI is so similar to Word to begin with.
For anyone looking for a solution to this problem, I did something like this:
You can replace the strings in the delete_between function with whatever you want. That seemed to work for me though.
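The helper mentioned above is not shown, so here is a guess at what a `delete_between` function might look like, sketched in Python rather than in WordPress's PHP. The argument order and the marker strings (`<!--` and `-->`, which is how Word's pasted style block usually arrives) are assumptions, not the original poster's code.

```python
def delete_between(start: str, end: str, text: str) -> str:
    """Remove every span running from `start` through `end`, markers included.

    An unterminated `start` marker is left in place rather than deleting
    the rest of the text.
    """
    kept = []
    pos = 0
    while True:
        begin = text.find(start, pos)
        if begin == -1:
            break
        finish = text.find(end, begin + len(start))
        if finish == -1:
            break
        kept.append(text[pos:begin])  # keep the text before the junk span
        pos = finish + len(end)       # resume after the closing marker
    kept.append(text[pos:])
    return "".join(kept)


if __name__ == "__main__":
    pasted = "Hello <!-- /* Font Definitions */ --> world"
    print(delete_between("<!--", "-->", pasted))
```

Swapping in different marker strings changes what gets cut, which matches the suggestion to replace the strings with whatever you want.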