Thanks for splitting, I agree.
Thanks for the suggestion, I'll raise the memory limit in a dedicated prototype for NumPy or Pandas.
I confirm it is NumPy, and not Pandas itself, that is gobbling up all the memory. I used an adapted version of your script, and the results are:
```
Physical memory (RSS): 8.94 MB   Virtual memory (VMS): 16.69 MB
after import numpy
Physical memory (RSS): 33.32 MB  Virtual memory (VMS): 1036.91 MB
after import pandas
Physical memory (RSS): 65.62 MB  Virtual memory (VMS): 1144.77 MB
```
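For reference, the measurement boils down to something like this (a minimal sketch using psutil; the adapted script differs slightly in its details):

```python
import os
import psutil  # assumed available; used to read RSS/VMS of the current process

def get_memory_usage_linux():
    mem = psutil.Process(os.getpid()).memory_info()
    print(f"Physical memory (RSS): {mem.rss / 2**20:.2f} MB  "
          f"Virtual memory (VMS): {mem.vms / 2**20:.2f} MB")

get_memory_usage_linux()
import numpy                      # noqa: E402
print("after import numpy")
get_memory_usage_linux()
import pandas                     # noqa: E402
print("after import pandas")
get_memory_usage_linux()
```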
I indeed have 8 cores (vCPUs on a VM). Then trying with
```python
import os
os.environ["OMP_NUM_THREADS"] = "4"
```
before calling get_memory_usage_linux() yields:
```
Physical memory (RSS): 8.85 MB   Virtual memory (VMS): 16.69 MB
after import numpy
Physical memory (RSS): 31.12 MB  Virtual memory (VMS): 493.02 MB
after import pandas
Physical memory (RSS): 65.72 MB  Virtual memory (VMS): 568.85 MB
```
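For completeness, this is the ordering I used in that second run (my assumption being that the thread cap only takes effect if the variable is set before NumPy is imported):

```python
import os
os.environ["OMP_NUM_THREADS"] = "4"  # set before importing NumPy (assumption: read when the OpenMP runtime starts)

import numpy   # OpenMP/BLAS thread pools are sized around here
import pandas  # pulls in NumPy, which is already constrained
# ...followed by the same get_memory_usage_linux() calls as in the sketch above
```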
The server runs Debian 12.8 (kernel 6.1.0-28-amd64), whereas the jobeinabox Docker image (which shows a more "normal" behaviour) uses Ubuntu ...