Re: Large Data Files
Hello Jerôme,
I had a similar issue some time ago with a huge nightly batch job. I also used pandas. In the end I created two dataframes and only synchronized the diff between legacy and Odoo. Pandas can process huge dataframes very fast. This reduced sync time from multiple hours to minutes.
Just sharing. Don't know if this is suitable for your problem.
BR Goran
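A minimal sketch of that diff approach, assuming both sides can be loaded into DataFrames that share an external key plus the synced columns; the key and column names here are made up for illustration:

import pandas as pd

def diff_frames(legacy_df, odoo_df, key='external_ref'):
    # Outer-merge both frames on a shared key; the indicator column tells
    # which side each row came from (hypothetical key/column names).
    merged = legacy_df.merge(odoo_df, on=key, how='outer',
                             suffixes=('_legacy', '_odoo'), indicator=True)
    to_create = merged[merged['_merge'] == 'left_only']   # only in legacy
    to_delete = merged[merged['_merge'] == 'right_only']  # only in Odoo
    both = merged[merged['_merge'] == 'both']
    # Flag rows where any synced column differs between the two sides
    # (NaN-vs-NaN comparisons count as differences here; refine as needed).
    synced_cols = [c for c in legacy_df.columns if c != key]
    changed = pd.Series(False, index=both.index)
    for col in synced_cols:
        changed |= both[col + '_legacy'] != both[col + '_odoo']
    to_update = both[changed]
    return to_create, to_update, to_delete

Only the three resulting frames then need to touch Odoo, which is what keeps the nightly run down to minutes.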
20.08.2024 17:52:42 Tom Blauwendraat <notifications@odoo-community.org>:
There's also create_multi in some Odoo versions, but which one are you on?
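For reference, a minimal sketch of the batched create being referred to, assuming Odoo 12.0 or later where create() is decorated with @api.model_create_multi and accepts a list of value dicts; the field values are made up and self stands for any recordset with an env:

# A minimal sketch, assuming Odoo 12.0+ where create() carries the
# @api.model_create_multi decorator and therefore accepts a list of dicts.
# Run from a model method; the values below are made up.
vals_list = [
    {'name': 'Meeting A', 'date_begin': '2024-08-21 09:00:00',
     'date_end': '2024-08-21 10:00:00'},
    {'name': 'Meeting B', 'date_begin': '2024-08-21 11:00:00',
     'date_end': '2024-08-21 12:00:00'},
]
events = self.env['event.event'].create(vals_list)  # one batched ORM call
# Before 12.0, create() takes a single dict, so each record is a separate call.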
On 8/20/24 17:32, Jerôme Dewandre wrote:
Hello,
I am currently working on a sync with a legacy system (adesoft) that contains a large amount of data which must be synchronized on a daily basis (such as meetings).
Everything seems to start slowing down when I import 30,000 records with the conventional create() method.
I suppose the ORM might be an issue here. Potential workarounds:
1. Bypass the ORM to create the records with self.env.cr.execute (but if I want to delete them I will also need a custom query; see the sketch after this list)
2. Bypass the ORM with stored procedures (https://www.postgresql.org/docs/current/sql-createprocedure.html)
3. Increase the CPU/RAM/worker nodes
4. Some better ideas?
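A rough sketch of option 1, not tested code from this thread: it assumes event_event exposes plain name/location/date_begin/date_end columns matching the dict used in the snippet further down, and raw SQL skips ORM defaults, access rights, computed fields, translations and the ORM cache, so those need separate handling.

# Rough sketch of option 1 (illustrative; assumes event_event has plain
# name/location/date_begin/date_end columns). Raw SQL bypasses ORM defaults,
# constraints, computed fields and the cache, so handle those separately.
rows = [
    (row['name'], row['location'], row['date_begin'], row['date_end'])
    for _, row in df.iterrows()
]
if rows:
    placeholders = ", ".join(["(%s, %s, %s, %s)"] * len(rows))
    params = [value for values in rows for value in values]
    self.env.cr.execute(
        "INSERT INTO event_event (name, location, date_begin, date_end) "
        "VALUES " + placeholders,
        params,
    )
    # Remember to invalidate the ORM cache afterwards; the exact call differs
    # by version (e.g. self.env.invalidate_all() on recent releases).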
What would be the best way to go?
A piece of my current test (df is a pandas dataframe containing the new events):
@api.model
def create_events_from_df(self, df):
    Event = self.env['event.event']
    events_data = []
    for _, row in df.iterrows():
        event_data = {
            'location': row['location'],
            'name': row['name'],
            'date_begin': row['date_begin'],
            'date_end': row['date_end'],
        }
        events_data.append(event_data)
    # Create all events in a single batch
    Event.create(events_data)
Thanks in advance if you read this, and thanks again if you replied :)
Jérôme
_______________________________________________
Mailing-List: https://odoo-community.org/groups/contributors-15
Post to: mailto:contributors@odoo-community.org
Unsubscribe: https://odoo-community.org/groups?unsubscribe
by Goran Sunjka - 06:46 - 20 Aug 2024
Reference
Large Data Files
by "Jerôme Dewandre" <jerome.dewandre.mail@gmail.com> - 05:31 - 20 Aug 2024