Creating post content from external web scraper via JSON or RPC

I’m looking for a way to publish content gathered by a web scraper, which runs outside our WordPress site, as posts on that site. Right now, I have the scraped content formatted as JSON. Could I somehow use the JSON API to post that JSON-formatted data to a WP site, or would I need to use the XML-RPC approach?

Edit: I’m actually trying to help a scattered writing community by bringing together various poems and stories published across twenty different websites run by the writers individually. They would like a site showcasing their work as a whole. No spamming is taking place here.


The scraper, which is written in Python (using a framework called Scrapy), can output what it scrapes as JSON. Let’s say I have a title and a description, and those are output as title: Story Name and description: This story is about this. I then want to post those two bits of data to WordPress, on a different server, as a post of type Story. Since my research suggests that the XML-RPC approach may be outdated, I’m asking whether the newer JSON API supports this and where I could find a good example. I’ve been looking for examples but so far haven’t found any.

Edit: Given the downvotes and the fact that I initially failed to identify myself as a non-spammer, this thread might be going the way of many threads in the open source community. However, for the sake of anyone going down the same path as me who is simply looking for a working example of how to use the JSON API to post content: so far I’ve been able to create a nonce, which is encouraging. I think my next step is to write a controller so I can actually post content of type post. I did find this discussion:

http://wordpress.org/support/topic/plugin-json-api-how-to-add-a-comment-or-post

This is the URL I tried after generating a nonce for myself, and I got an error:

http://mysite.com/?json=post.create_posts&nonce='5d3f89d00e'&title='testingpost'&content='this%20is%20mypost%20stuff'&status=publish

Which currently returns: {"status":"error","error":"Unknown controller 'post'."}
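For anyone following along, here is a minimal Python sketch of building that kind of request URL with the query string encoded by the standard library, so the values need no quotes and the spaces are escaped automatically. The helper name `build_api_url` is hypothetical, and the method name `posts.create_post` is an assumption based on the error message complaining about the controller name; check which controllers and methods your installed version of the plugin actually exposes.

```python
from urllib.parse import urlencode


def build_api_url(site, method, **params):
    """Build a JSON API style URL: <site>/?json=<method>&key=value...

    `method` is passed through untouched; this helper only handles encoding.
    """
    query = {"json": method}
    query.update(params)
    # urlencode escapes each value, so no manual quoting or %20 is needed.
    return site.rstrip("/") + "/?" + urlencode(query)


url = build_api_url(
    "http://mysite.com",
    "posts.create_post",  # assumed controller.method name, not confirmed here
    nonce="5d3f89d00e",
    title="testingpost",
    content="this is mypost stuff",
    status="publish",
)
print(url)
```

This only constructs the URL; it does not send the request or say anything about whether the plugin will accept it.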


2 comments

  1. Since you’re already using Python, I recommend grabbing a Python XML-RPC library and building a request into your scraper. We actually added support for creating new custom post type entries in WordPress 3.4.

    If it helps, the documentation for Python’s XML-RPC library is freely available, as is the documentation for the WordPress wp.newPost RPC call.

    Just specify a post_type when you build your post’s content struct and it will be created as you expect within WordPress.
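The approach described in this answer could be sketched roughly as follows, using `xmlrpc.client` from the Python 3 standard library. Treat it as a starting point under stated assumptions rather than a tested integration: the helper names `build_content` and `publish` are hypothetical, the post type slug `story` is a guess at how the Story type is registered, and the endpoint URL and credentials in the usage comment are placeholders. The scraped item is assumed to be a dict with `title` and `description` keys, as in the question.

```python
import xmlrpc.client


def build_content(item, post_type="story", status="draft"):
    """Map one scraped item to the content struct passed to wp.newPost.

    "story" is an assumed slug; use whatever slug the custom post type
    is registered under on the WordPress site.
    """
    return {
        "post_type": post_type,
        "post_status": status,
        "post_title": item["title"],
        "post_content": item["description"],
    }


def publish(server, username, password, item, blog_id=0, **options):
    """Call wp.newPost on an XML-RPC server proxy and return the new post ID."""
    content = build_content(item, **options)
    return server.wp.newPost(blog_id, username, password, content)


# Usage (placeholders; nothing is sent until publish() is called):
#   server = xmlrpc.client.ServerProxy("https://example.com/xmlrpc.php")
#   item = {"title": "Story Name", "description": "This story is about this."}
#   post_id = publish(server, "scraper-user", "secret", item)
```

Posts are created as drafts here so that a batch of scraped items can be reviewed before anything goes live; pass `status="publish"` to publish immediately.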
