how to increase the chunk size

alexgv12
9 - Travel Pro

I have a table that has 1,454,334,099 records. The build downloads 100,000 records per chunk, and in one hour it downloads about 100,000,000. How can I increase the chunk size, or what is the best alternative to decrease the build times?
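For scale, the numbers above imply a full import takes roughly 14.5 hours at the observed rate; a quick back-of-the-envelope check (plain arithmetic on the figures in the question):

    # Rough full-build estimate from the figures in the question.
    total_rows = 1_454_334_099
    rows_per_hour = 100_000_000   # observed import throughput
    chunk_size = 100_000          # current chunk size

    print(f"~{total_rows / rows_per_hour:.1f} hours per full build")  # ~14.5
    print(f"~{total_rows / chunk_size:,.0f} chunks per full build")   # ~14,543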

3 REPLIES

HamzaJ
12 - Data Integration

Hi @alexgv12 ,

On Windows, you can change the chunk size by going to Admin > Datasources > Cogwheel and changing the chunk size there.

On Linux, go to Admin > System Management > Configuration > Elasticube build params.

I would advise you to confer with support on this one. Increasing the chunk size does not necessarily lower build times: more rows per chunk means more rows need to be imported and processed at the same time. To decrease build times you could check (see the monitoring sketch after this list):

  • general resource usage of the server during builds
  • free disk space
  • resource usage of the source you are importing from
  • read/write speeds of data on the server
  • internet connection
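If you want a quick snapshot of most of these while a build is running, something like the sketch below works on the build server (a minimal example assuming the third-party psutil package is installed; the 1-second CPU sampling window is an arbitrary choice):

    # Snapshot of server resources to capture during a build.
    # Requires: pip install psutil
    import shutil
    import psutil

    cpu = psutil.cpu_percent(interval=1)      # CPU % over a 1-second window
    mem = psutil.virtual_memory().percent     # RAM in use, %
    disk = shutil.disk_usage("/")             # free disk space on the root volume
    io = psutil.disk_io_counters()            # cumulative disk I/O since boot
    net = psutil.net_io_counters()            # cumulative network I/O since boot

    print(f"CPU {cpu}% | RAM {mem}% | free disk {disk.free / 1e9:.1f} GB")
    print(f"disk read/write: {io.read_bytes / 1e9:.1f} / {io.write_bytes / 1e9:.1f} GB")
    print(f"net sent/recv: {net.bytes_sent / 1e9:.1f} / {net.bytes_recv / 1e9:.1f} GB")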

I have a table of around 100,000,000 records that is loaded within 15 minutes. You could also investigate whether you need to import all data on every build. Perhaps it is possible to only load changes/new data, which would speed up the process.
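Sisense's accumulative build option is the built-in way to load only new data. Purely to illustrate the idea on the source side, here is a sketch that pulls only rows past a stored watermark using the databricks-sql-connector package (hostname, HTTP path, token, table and column names are all placeholders, and production code should use the connector's parameter binding instead of string formatting):

    # Incremental pull: fetch only rows newer than the last watermark
    # instead of re-reading the full table on every build.
    # Requires: pip install databricks-sql-connector
    from databricks import sql

    last_watermark = "2024-01-01 00:00:00"  # persisted from the previous run

    with sql.connect(
        server_hostname="your-workspace.cloud.databricks.com",  # placeholder
        http_path="/sql/1.0/warehouses/your-warehouse-id",      # placeholder
        access_token="your-token",                              # placeholder
    ) as conn:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT * FROM my_catalog.my_schema.big_table "
                f"WHERE updated_at > '{last_watermark}'"  # placeholder filter
            )
            batch = cur.fetchmany(100_000)  # read in chunks, not all at once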

 

Hamza

Our source is a table in Databricks. We do notice some network peaks in the cluster, but we do not see other consumption peaks. Currently we only partition this table by one column that we use in the queries. Would it be good to add other partitions even if we do not use them in the query?
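Regarding the partitioning question: in Databricks (Delta), partitioning mainly helps queries that filter on the partition column, so partitioning by columns the extraction query never uses tends to add overhead (more small files) rather than speed anything up. A minimal PySpark sketch of rewriting a table partitioned by a column the build query actually filters on (table and column names are placeholders):

    # Rewrite a Delta table partitioned by a frequently filtered column.
    # Table and column names below are placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    df = spark.table("my_catalog.my_schema.big_table")
    (
        df.write.format("delta")
          .mode("overwrite")
          .partitionBy("load_date")  # pick the column your build query filters on
          .saveAsTable("my_catalog.my_schema.big_table_partitioned")
    )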

alexgv12
9 - Travel Pro

Thanks for your help. Yes, I found the option to increase the chunk size, but something strange is happening: I don't see the change reflected in the builds. We are going to reinitialize the pods.
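For reference, a Linux Sisense deployment runs on Kubernetes, so reinitializing the pods usually means a kubectl restart; a minimal sketch via Python's subprocess module (the "sisense" namespace and the deployment name are assumptions, so verify them against your cluster first):

    # Inspect and restart pods with kubectl. The namespace and deployment
    # name are assumptions; check `kubectl get namespaces` and
    # `kubectl -n sisense get deployments` first.
    import subprocess

    subprocess.run(["kubectl", "-n", "sisense", "get", "pods"], check=True)

    # Targeted restart of a single deployment (placeholder name):
    subprocess.run(
        ["kubectl", "-n", "sisense", "rollout", "restart",
         "deployment/<deployment-name>"],
        check=True,
    )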