What would be the best way to import more than 30,000 questions and answers?


Hi,

I want to import a huge database of around 30,000 questions and answers.

I have been trying with the following code:

https://gist.github.com/rahularyan/7b85c9678ddef2c21208

But it’s causing meta problems: even after I provide a user ID, the activity log says “Anonymous answered…”, and the question/answer counts and reputation points are not updated.

I have also read your answer to the question “Importing from Django”.

I tried merging the two, but even that produces some bugs (e.g. in the question/answer counts, reputation points, and activity line).

The biggest problem I am facing right now is importing such a large database. It seems to take forever to import 30,000 questions, and I think the actual number of queries to run is a lot higher (first 30k questions, then 30k answers, plus their meta entries). I have even increased my localhost timeout.

Can you please suggest:

1. How should I modify this code, or what would be the correct code, to import all questions and answers from a linear database (each question and its answer are in the same row)?

2. What can be done to increase import speed?

Are you importing from q2a?

No, it’s not q2a. It’s my custom Q&A database, created for a multiple-choice quiz.
I have customized it for WordPress/AnsPress with columns such as Question_Title (Post_Title), Question_Content (Post_Content), and Answer_Content (Post_Content).

When you used to work on q2a themes, I used similar code to import Q&A into q2a from this database. Back then it had only about 5k questions and answers, and even so the import took me a whole day (plus some days to work out which other meta I had to include in my custom script). So I am guessing that this time it would take days if I blindly go by trial and error.


Dang, I’ve been trying to import my database to AnsPress but keep getting stuck. So this is super helpful. Thank you.

Yes, thanks to Atul. Don’t forget to call wp_set_current_user (https://codex.wordpress.org/Function_Reference/wp_set_current_user), otherwise the after-answer and after-question hooks will not run.

Two things to note, which the current code is missing:
1. (If you have a large DB to import) increase your PHP execution time.
2. Call “wp_set_current_user” to set the specific user ID for each question and answer 🙂 [otherwise, if the user logged in at the time of the import differs from the user ID entered in the code, there will be AnsPress meta issues].
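
The second point can be sketched roughly as below. This is a minimal sketch and not the code from either gist: the function name import_qa_row and the question_user_id / answer_user_id columns are hypothetical, and it assumes AnsPress registers the post types 'question' and 'answer' and links an answer to its question through post_parent. Only wp_set_current_user, wp_insert_post and is_wp_error are WordPress core functions.

```php
<?php
/**
 * Import one flat row that holds a question and its answer side by side.
 * Hypothetical helper: returns [question_id, answer_id], or null on failure.
 */
function import_qa_row(array $row): ?array
{
    // Become the question's author first, so hooks that read the current
    // user (activity log, counts, reputation) see the intended user.
    wp_set_current_user((int) $row['question_user_id']);

    $question_id = wp_insert_post([
        'post_type'    => 'question', // assumed AnsPress post type
        'post_title'   => $row['Question_Title'],
        'post_content' => $row['Question_Content'],
        'post_status'  => 'publish',
        'post_author'  => (int) $row['question_user_id'],
    ], true); // true: return a WP_Error instead of 0 on failure

    if (is_wp_error($question_id)) {
        return null;
    }

    // Switch to the answer's author before inserting the answer.
    wp_set_current_user((int) $row['answer_user_id']);

    $answer_id = wp_insert_post([
        'post_type'    => 'answer', // assumed AnsPress post type
        'post_content' => $row['Answer_Content'],
        'post_status'  => 'publish',
        'post_author'  => (int) $row['answer_user_id'],
        'post_parent'  => $question_id, // assumed link to the question
    ], true);

    return is_wp_error($answer_id) ? null : [$question_id, $answer_id];
}
```

In an import script that has already loaded WordPress, this would be called once per source row, e.g. `foreach ($rows as $row) { import_qa_row($row); }`.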

Thank you both. I didn’t see your replies before I made my later post. Atul, how long did you need to set your PHP execution time to? I imagine really long, if the import took an entire day. I have ~3,000 Q/As to import. I’m still unsure where to place the wp_set_current_user call.

I first tried “max_execution_time = 5000” in the php.ini of my WAMP server. That allowed me to import only about 2,700 questions (plus 2,700 answers). Since I had to import about 30k Q/As, I increased the value to 50,000 and, to be on the safe side, also added “set_time_limit(0);” at the top of my import.php file.

Thanks Atul. Very helpful. I will probably shoot for 8000 max.

@Stephen, I just did a successful (I think) trial with “wp_set_current_user”, and almost all the meta imported correctly. Unfortunately, I can’t post an answer to my own question 🙁 I am uploading sample code to a gist and will post the link shortly.

That’s awesome. Thank you.

Here is the gist link:
https://gist.github.com/atultiwari/cd9b5d00e54c238a02d1

I think I might write about the pros, cons, prerequisites, limitations, and bugs of this piece of code in a separate post. It’s hard to follow in comments.
