Re: Large Data Files
Hello Jerôme,
I had a similar issue some time ago with a huge nightly batch job. I also used pandas. In the end I created two dataframes and only synchronized the diff between legacy and Odoo. Pandas can process huge dataframes very fast. This reduced sync time from multiple hours to minutes.
Just sharing. Don't know if this is suitable for your problem.
BR Goran
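A minimal sketch of that diff approach, assuming both sides can be loaded into DataFrames that share an external key plus the synced columns; the key and column names here are made up for illustration:

import pandas as pd

def diff_frames(legacy_df, odoo_df, key='external_ref'):
    # Outer-merge both frames on a shared key; the indicator column tells
    # which side each row came from (hypothetical key/column names).
    merged = legacy_df.merge(odoo_df, on=key, how='outer',
                             suffixes=('_legacy', '_odoo'), indicator=True)
    to_create = merged[merged['_merge'] == 'left_only']   # only in legacy
    to_delete = merged[merged['_merge'] == 'right_only']  # only in Odoo
    both = merged[merged['_merge'] == 'both']
    # Flag rows where any synced column differs between the two sides
    # (NaN-vs-NaN comparisons count as differences here; refine as needed).
    synced_cols = [c for c in legacy_df.columns if c != key]
    changed = pd.Series(False, index=both.index)
    for col in synced_cols:
        changed |= both[col + '_legacy'] != both[col + '_odoo']
    to_update = both[changed]
    return to_create, to_update, to_delete

Only the three resulting frames then need to touch Odoo, which is what keeps the nightly run down to minutes.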
20.08.2024 17:52:42 Tom Blauwendraat <notifications@odoo-community.org>:
There's also create_multi in some Odoo versions, but which one are you on?
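For reference, a minimal sketch of the batched create being referred to, assuming Odoo 12.0 or later where create() is decorated with @api.model_create_multi and accepts a list of value dicts; the field values are made up and self stands for any recordset with an env:

# A minimal sketch, assuming Odoo 12.0+ where create() carries the
# @api.model_create_multi decorator and therefore accepts a list of dicts.
# Run from a model method; the values below are made up.
vals_list = [
    {'name': 'Meeting A', 'date_begin': '2024-08-21 09:00:00',
     'date_end': '2024-08-21 10:00:00'},
    {'name': 'Meeting B', 'date_begin': '2024-08-21 11:00:00',
     'date_end': '2024-08-21 12:00:00'},
]
events = self.env['event.event'].create(vals_list)  # one batched ORM call
# Before 12.0, create() takes a single dict, so each record is a separate call.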
On 8/20/24 17:32, Jerôme Dewandre wrote:
Hello,
I am currently working on a sync with a legacy system (adesoft) that contains a large amount of data which must be synchronized on a daily basis (such as meetings).
Everything seems to start slowing down when I import 30,000 records with the conventional create() method.
I suppose the ORM might be an issue here. Potential workarounds:
1. Bypass the ORM to create the records with self.env.cr.execute (but if I want to delete them I will also need a custom query; see the sketch after this list)
2. Bypass the ORM with stored procedures (https://www.postgresql.org/docs/current/sql-createprocedure.html)
3. Increase the CPU/RAM/worker nodes
4. Some better ideas?
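A rough sketch of option 1, not tested code from this thread: it assumes event_event exposes plain name/location/date_begin/date_end columns matching the dict used in the snippet further down, and raw SQL skips ORM defaults, access rights, computed fields, translations and the ORM cache, so those need separate handling.

# Rough sketch of option 1 (illustrative; assumes event_event has plain
# name/location/date_begin/date_end columns). Raw SQL bypasses ORM defaults,
# constraints, computed fields and the cache, so handle those separately.
rows = [
    (row['name'], row['location'], row['date_begin'], row['date_end'])
    for _, row in df.iterrows()
]
if rows:
    placeholders = ", ".join(["(%s, %s, %s, %s)"] * len(rows))
    params = [value for values in rows for value in values]
    self.env.cr.execute(
        "INSERT INTO event_event (name, location, date_begin, date_end) "
        "VALUES " + placeholders,
        params,
    )
    # Remember to invalidate the ORM cache afterwards; the exact call differs
    # by version (e.g. self.env.invalidate_all() on recent releases).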
What would be the best way to go?
A piece of my current test (df is a pandas dataframe containing the new events):
@api.model
def create_events_from_df(self, df):
    Event = self.env['event.event']
    events_data = []
    for _, row in df.iterrows():
        event_data = {
            'location': row['location'],
            'name': row['name'],
            'date_begin': row['date_begin'],
            'date_end': row['date_end'],
        }
        events_data.append(event_data)
    # Create all events in a single batch
    Event.create(events_data)
Thanks in advance if you read this, and thanks again if you replied :)
Jérôme
_______________________________________________
Mailing-List: https://odoo-community.org/groups/contributors-15
Post to: mailto:contributors@odoo-community.org
Unsubscribe: https://odoo-community.org/groups?unsubscribe
by Goran Sunjka - 06:46 - 20 Aug 2024
Reference
Large Data Files
by "Jerôme Dewandre" <jerome.dewandre.mail@gmail.com> - 05:31 - 20 Aug 2024