What would be the best way to import more than 30,000 questions and answers?


Hi,

I want to import a huge database of around 30,000 questions and answers.

I have been trying with the following code:

https://gist.github.com/rahularyan/7b85c9678ddef2c21208

But it's causing meta problems: even after I provide a user ID, the activity log says "Anonymous answered…", and the question/answer counts and reputation points are not updated.

I have also read your answer to the question "Importing from Django".

I tried merging the two, but even that produces some bugs (e.g. in the question/answer counts, reputation points, and activity line).

Also, the biggest problem I am facing right now is importing such a large database. It appears to take forever to import 30,000 questions, and I think the actual number of queries to run would be much higher (first 30k questions, then 30k answers, and their meta entries). I have even increased my localhost timeout.

Can you please suggest:

1. How should I modify this code, or what would be the correct code, to import all questions and answers from a linear database (each question and its answer are in the same row)?

2. What can be done to increase import speed?

Are you importing from q2a?

No, it's not q2a. It's my custom Q&A database created for a multiple-choice quiz.
I have customized it for WordPress/AnsPress by adding columns such as Question_Title (Post_Title), Question_Content (Post_Content), and Answer_Content (Post_Content).

When you used to work on q2a themes, I used similar code to import Q&A into q2a from this database. Back then it had approx. 5k questions and answers, and even then it took me a whole day to import (and some days to understand which other meta I had to include in my custom script). So I am guessing this time it would take days if I blindly go by trial and error.


So Rahul, I may try using wp_set_current_user instead of attributing everything to one user like atultiwari did. How would you set it up in the code atultiwari created above? I get the idea behind the function, but I don't understand how it would be implemented. I am also not sure how the meta conflicts.

I think you should set the current user ID (with that function) inside the while loop (while ($row = mysql_fetch_array($qry_qbank)) ), where it would pick up each question's/answer's user ID from your database. But I have not tested it, and I am not a programmer either. So either try it by trial and error or, better, wait for Rahul to answer.

Gotcha. Yeah, I've been trying guess and check, but it's been really frustrating so far because I can't get it to work right. But this has helped A TON. It makes sense to put it in the loop, though.


Dang, I’ve been trying to import my database to AnsPress but keep getting stuck. So this is super helpful. Thank you.

Yes, thanks to Atul. Don't forget to add https://codex.wordpress.org/Function_Reference/wp_set_current_user, otherwise the after-answer and after-question hooks will not run.

Two things to note, which the current code is missing:
1. (If you have a large DB to import) increase your PHP execution time.
2. Implement "wp_set_current_user" to set the specific user ID for each question and answer 🙂 [otherwise, if the user ID of the currently logged-in user (at the time of import) differs from the user ID entered in the code, there will be AnsPress meta issues].
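To make the second point concrete, here is a minimal, untested sketch of how one row could be imported while switching the current user. It assumes WordPress and AnsPress are already loaded and that AnsPress stores questions and answers as the post types 'question' and 'answer', with the answer's post_parent pointing at the question. The function name import_qa_row and the columns question_user_id and answer_user_id are hypothetical; the other column names are the ones mentioned earlier in this thread. It is not the code from either gist.

```php
<?php
// Sketch only: requires WordPress (and AnsPress) to be loaded already.
// import_qa_row(), question_user_id and answer_user_id are hypothetical names.
function import_qa_row( array $row ) {
    // Act as the question's author so that hooks credit the right user.
    wp_set_current_user( (int) $row['question_user_id'] );

    $question_id = wp_insert_post( array(
        'post_type'    => 'question', // assumed AnsPress post type
        'post_status'  => 'publish',
        'post_title'   => $row['Question_Title'],
        'post_content' => $row['Question_Content'],
        'post_author'  => (int) $row['question_user_id'],
    ), true );

    // Stop here if the question could not be created.
    if ( is_wp_error( $question_id ) ) {
        return $question_id;
    }

    // Switch to the answer's author before inserting the answer.
    wp_set_current_user( (int) $row['answer_user_id'] );

    return wp_insert_post( array(
        'post_type'    => 'answer', // assumed AnsPress post type
        'post_status'  => 'publish',
        'post_parent'  => $question_id,
        'post_title'   => $row['Question_Title'],
        'post_content' => $row['Answer_Content'],
        'post_author'  => (int) $row['answer_user_id'],
    ), true );
}
```

Inside the while loop discussed above, this would be called once per fetched row, e.g. import_qa_row( $row ), so the current user changes with every question and answer.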

Thank you both. I didn't see your replies before the post I made afterwards. Atul, how long did you need to set your PHP execution time to? I imagine really long, given that it took an entire day. I have ~3000 Q/As to import. I'm still unsure where to place wp_set_current_user.

I first tried with "max_execution_time = 5000" in the php.ini of my WAMP server. That allowed me to import only approx. 2700 Q (+ 2700 A). But I had to import approx. 30k Q/A, so I increased this to 50,000 and, to be on the safe side, I also added "set_time_limit(0);" at the top of my import.php file.
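For anyone following along, the top of import.php could look roughly like the sketch below. It only lifts the time limit for this one script and reads the setting back as a sanity check; the echo is just for illustration, and some hosts disable set_time_limit, so the php.ini change may still be needed.

```php
<?php
// Remove PHP's execution time limit for this script only (0 = no limit).
set_time_limit( 0 );

// Read the setting back to confirm the change was accepted.
$limit = ini_get( 'max_execution_time' );
echo "max_execution_time is now {$limit}\n";

// ... the rest of the import code would follow here ...
```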

Thanks Atul. Very helpful. I will probably shoot for 8000 max.

@Stephen, I just did a successful (I think) trial with "wp_set_current_user", with almost all meta importing correctly. Unfortunately, I can't post an answer to my own question 🙁 I am uploading sample code to Gist and will post the link shortly.

That's awesome. Thank you.

Here is the gist link:
https://gist.github.com/atultiwari/cd9b5d00e54c238a02d1

I think I might write about the pros, cons, prerequisites, limitations, and bugs of this piece of code in a separate post. It's hard to follow in comments.


You can run your code after the WP init hook and everything will work as usual.

Create a page in WordPress called import, then create a new file called page-import.php inside the active theme directory and paste your code without:

require('wp-blog-header.php');

Then just visit that page and the import will start.

Thanks. I have finally imported my Q&A. I also found two things:
1. I was actually running the code from the WP root folder exactly as posted above, including "require('wp-blog-header.php');". The page-import.php method does the very same thing, so the meta issues were not caused by running my code from the WP root.
2. The meta issues occurred because the AnsPress meta code uses only the currently logged-in user, while I was providing a user ID in my code itself. The two were conflicting.

There were some minor issues in the import process, but I am OK with those. For a few major issues, though, I will have to post separate questions.

Oops… I came to know about this function a little late 🙁 It took me a whole night to import all the Q&A into my local WP under a single user name, and earlier this morning I also migrated it to my live site. I don't see any need to rewrite the code and spend another night on the import… If I do this much programming, people will replace the Dr. in front of my name with Er. Thanks anyway.

Hahaha 🙂 It happens.