I am trying to use the wp_insert_post function to create posts/pages from my HTML pages.
So far I’ve got:
// Read the raw HTML file from disk
$html = file_get_contents($dir . $data['url']);
$post = array(
    'post_content' => $html,
    'post_name'    => sanitize_title_with_dashes($data['title']),
    'post_title'   => $data['title'],
    'post_status'  => 'publish',
    'post_type'    => $data['type'],
);
// Returns the new post ID
$post_id = wp_insert_post($post);
However, this does not seem to work: post_content is always empty. I’ve tried different sanitization functions, but none of them helped.
The HTML code is read into the $html variable, but I guess the problem arises because it spans multiple lines.
Can you give me a hint?
I’ve found that the files I was reading were encoded in UCS-2. file_get_contents returns the raw bytes, which look fine in a browser (the browser handles character encoding itself) but are not usable by WP.
The solution was to re-encode the HTML files as UTF-8.
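If re-saving every file by hand is not practical, the conversion could also be done in PHP just after reading the file. This is only a sketch: the helper name html_to_utf8 is made up for illustration, it assumes the mbstring extension is available, and it assumes the UCS-2 files start with a byte-order mark (UCS-2 is treated here as UTF-16, which covers the same characters).

```php
<?php
// Hypothetical helper (not from the original post): convert file contents
// that start with a UTF-16/UCS-2 byte-order mark to UTF-8.
// Assumes the mbstring extension is loaded.
function html_to_utf8(string $raw): string
{
    if (strncmp($raw, "\xFF\xFE", 2) === 0) {
        // Little-endian BOM: drop it and convert the rest
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16LE');
    }
    if (strncmp($raw, "\xFE\xFF", 2) === 0) {
        // Big-endian BOM: drop it and convert the rest
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16BE');
    }
    // No BOM found: assume the content is already UTF-8 and leave it alone
    return $raw;
}

// Possible usage with the code from the question:
// $html = html_to_utf8(file_get_contents($dir . $data['url']));
```

Files without a byte-order mark pass through unchanged, so they would still need to be checked separately.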
You can trick the function into thinking that the input has already been sanitized by setting the filter attribute.
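The code that accompanied this answer is not preserved here, so the following is only a guess at what it looked like: the array from the question with a 'filter' key added. The helper name build_post_args is made up for illustration, and whether the key has any effect depends on the WordPress version (some releases may discard it before sanitizing), so treat that as an assumption to verify against your installation.

```php
<?php
// Hypothetical helper (not from the original post): build the argument
// array for wp_insert_post() with the 'filter' key set.
function build_post_args(string $html, array $data): array
{
    return array(
        'post_content' => $html,
        'post_title'   => $data['title'],
        'post_status'  => 'publish',
        'post_type'    => $data['type'],
        // Assumption: marks the data as already sanitized so that
        // sanitize_post() leaves it untouched; verify for your WP version.
        'filter'       => true,
    );
}

// Possible usage inside WordPress:
// $post_id = wp_insert_post(build_post_args($html, $data));
```

Note that this skips sanitization rather than fixing the input, so if the content is empty because of its encoding, converting the files to UTF-8 is the more direct fix.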