Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hadd issue when using parallelization together with indirect file #14910

Closed
1 task done
Tracked by #14319
mlizzo opened this issue Mar 7, 2024 · 10 comments · Fixed by #14913
Closed
1 task done
Tracked by #14319

hadd issue when using parallelization together with indirect file #14910

mlizzo opened this issue Mar 7, 2024 · 10 comments · Fixed by #14913
Assignees
Labels

Comments

@mlizzo
Copy link

mlizzo commented Mar 7, 2024

Check duplicate issues.

  • Checked for duplicates

Description

Dear ROOT experts,

It seems that when running the hadd command, the code parallelization is not triggered (e.g. -j 10) if target files are stored in a .txt file ("indirect file")

Reproducer

Input files and .txt file are stored here, I am sorry but I wasn't able to upload root files to git. The issue can be reproduced by running:
hadd -j 10 mkShapes__RDF_2018_v9_emu_ttHMVA.root @doHadd.txt
The initial output says:

Parallelizing with 10 processes
hadd Target file: mkShapes__RDF_2018_v9_emu_ttHMVA.root
hadd compression setting for all output: 1
Each process should handle at least 3 files for efficiency. Setting the number of processes to: 1

The last line is not expected when going to parallel mode, it looks like the script was expecting more arguments from the command line, but target files are stored in the doHadd.txt file

ROOT version

   ------------------------------------------------------------------
  | Welcome to ROOT 6.28/00                        https://root.cern |
  | (c) 1995-2022, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Feb 03 2023, 14:50:41                 |
  | From tags/v6-28-00@v6-28-00                                      |
  | With g++ (GCC) 11.3.0                                            |
  | Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q'  |
   ------------------------------------------------------------------

Installation method

/cvmfs/sft.cern.ch/lcg/views/LCG_103/x86_64-centos9-gcc11-opt/bin/root

Operating system

Linux

Additional context

No response

@ferdymercury
Copy link
Collaborator

Feel free to try this patch: #14913

@mlizzo
Copy link
Author

mlizzo commented Mar 8, 2024

Hi @ferdymercury , thank you for your prompt answer. Maybe I'm doing something wrong, but I wasn't able to get the target file. The patch seems to correctly trigger the parallelization execution:

Parallelizing with 10 processes. hadd Target file: /afs/cern.ch/work/m/mlizzo/mkShapesRDF_el9/PlotsConfigurationsRun3/VBS_OS_pol/ttHMVA_Full2018_v9/rootFiles__RDF_2018_v9_emu_ttHMVA/mkShapes__RDF_2018_v9_emu_ttHMVA.root hadd compression setting for all output: 101 // compression level is different, don't know why

but then I get:

hadd Target path: /tmp/mlizzo/partial0_088bf922-dd2d-11ee-b92d-3f4eb9bcbeef.root:/
hadd Target path: /tmp/mlizzo/partial0_088bf922-dd2d-11ee-b92d-3f4eb9bcbeef.root:/VBS_2j_em
hadd Target path: /tmp/mlizzo/partial0_088bf922-dd2d-11ee-b92d-3f4eb9bcbeef.root:/VBS_2j_em/events
hadd Target path: /tmp/mlizzo/partial0_088bf922-dd2d-11ee-b92d-3f4eb9bcbeef.root:/VBS_2j_em/ttHMVASF_LepWP
hadd Target path: /tmp/mlizzo/partial0_088bf922-dd2d-11ee-b92d-3f4eb9bcbeef.root:/top_2j_em
hadd Target path: /tmp/mlizzo/partial0_088bf922-dd2d-11ee-b92d-3f4eb9bcbeef.root:/top_2j_em/events
hadd Target path: /tmp/mlizzo/partial0_088bf922-dd2d-11ee-b92d-3f4eb9bcbeef.root:/top_2j_em/ttHMVASF_LepWP
hadd failed at the parallel stage

Were you able to properly merge root files? I compiled the hadd.cxx macro with the following command:
c++ hadd.cxx -o hadd $(root-config --cflags --libs)

@ferdymercury
Copy link
Collaborator

wrt compression level, did you check the flags to set the compression?
see:
hadd --help

@mlizzo
Copy link
Author

mlizzo commented Mar 8, 2024

Unfortunately I had to comment out #include haddCommandLineOptionsHelp.h and the corresponding print, I don't know why that header could not be found - the "standard" hadd executable is the one installed in lxplus (CERN) and to be honest I have no idea how that is compiled. In any case, I'm using default options, I just noticed that it changed (101) with respect to what was set before (1), can this play a role? Did you manage to merge input files eventually? Thank you

@ferdymercury
Copy link
Collaborator

ferdymercury commented Mar 8, 2024

If you do not specify anything, you will get:

/// Use the compile-time default setting
         kUseCompiledDefault = 101,

Here the full help:

usage: hadd [-a] [-f] [-f[0-9]] [-fk] [-ff] [-k] [-O] [-T] [-v V] [-j J]
            [-dbg] [-d D] [-n N] [-cachesize CACHESIZE]
            [-experimental-io-features EXPERIMENTAL_IO_FEATURES]
            TARGET SOURCES

This program will add histograms, trees and other objects from a list
of ROOT files and write them to a target ROOT file. The target file is
newly created and must not exist, or if -f ("force") is given, must
not be one of the source files.

OPTIONS:
  -a                                   Append to the output
  -f                                   Force overwriting of output file
  -f[0-9]                              Gives the ability to specify the compression level of the target file.
                                       Default is 1 (kDefaultZLIB), 0 is uncompressed, 9 is maximum
                                       compression (see TFile::TFile documentation). You can also specify the
                                       full compresion algorithm, e.g. -f206
  -fk                                  Sets the target file to contain the baskets with the same compression
                                       as the input files (unless -O is specified). Compresses the meta data
                                       using the compression level specified in the first input or the
                                       compression setting after fk (for example 206 when using -fk206)
  -ff                                  The compression level use is the one specified in the first input
  -k                                   Skip corrupt or non-existent files, do not exit
  -O                                   Re-optimize basket size when merging TTree
  -T                                   Do not merge Trees
  -v                                   Explicitly set the verbosity level: 0 request no output, 99 is the
                                       default
  -j                                   Parallelize the execution in 'J' processes. If the number of processes
                                       is not specified, use the system maximum.
  -dbg                                 Enable verbosity. If -j was specified, do not not delete partial files
                                       stored inside working directory.
  -d                                   Carry out the partial multiprocess execution in the specified
                                       directory
  -n                                   Open at most 'N' files at once (use 0 to request to use the system
                                       maximum)
  -cachesize                           Resize the prefetching cache use to speed up I/O operations (use 0 to
                                       disable)
  -experimental-io-features            Used with an argument provided, enables the corresponding experimental
                                       feature for output trees. See ROOT::Experimental::EIOFeatures
  TARGET                               Target file
  SOURCES                              Source files

If TARGET and SOURCES have different compression settings a slower
method is used. For options that takes a size as argument, a decimal
number of bytes is expected. If the number ends with a ``k'', ``m'',
``g'', etc., the number is multiplied by 1000 (1K), 1000000 (1MB),
1000000000 (1G), etc. If this prefix is followed by i, the number is
multiplied by the traditional 1024 (1KiB), 1048576 (1MiB), 1073741824
(1GiB), etc. The prefix can be optionally followed by B whose casing
is ignored, eg. 1k, 1K, 1Kb and 1KB are the same.

No, I also get a "failed" error.

@mlizzo
Copy link
Author

mlizzo commented Mar 8, 2024

Hi @ferdymercury , I think I have found what is causing the error. I believe that here the argc argument must be replaced with filesToProcess when an indirect file is present. I made a quick test and it worked fine, but maybe it's better if you experts could take a closer look into that

@ferdymercury
Copy link
Collaborator

I just edited the patch, try again ;)

@mlizzo
Copy link
Author

mlizzo commented Mar 8, 2024

It seems to work properly now, thanks! For which ROOT versions will this fix be available eventually? I was wondering if this modification could be pushed to the LCG release in /cvmfs/sft.cern.ch as well, but I don't know if this is feasible

@ferdymercury
Copy link
Collaborator

I guess that's a question for @dpiparo or @vepadulano

@vepadulano
Copy link
Member

As soon as the PR with the patch is merged in master, you will find it in the LCG release of the following day (i.e. at /cvmfs/sft.cern.ch/lcg/views/dev3/...). Afterwards, you will find it in the next ROOT release 6.32 (a couple months from now).

ferdymercury added a commit to ferdymercury/root that referenced this issue Apr 6, 2024
ferdymercury added a commit to ferdymercury/root that referenced this issue Apr 9, 2024
vepadulano added a commit that referenced this issue Apr 10, 2024
…14913)

Fixes #14910

The full list of input files to merge is now created beforehand by accumulating both the direct and indirect ones (e.g. those found listed in file(s) signaled by the `@` syntax at the command line). Afterwards, the list is used either in the sequential or parallel merge depending on user input.

---------

Co-authored-by: Philippe Canal <pcanal@fnal.gov>
Co-authored-by: Vincenzo Eduardo Padulano <vincenzo.eduardo.padulano@cern.ch>
vepadulano added a commit to vepadulano/root that referenced this issue Apr 10, 2024
…oot-project#14913)

Fixes root-project#14910

The full list of input files to merge is now created beforehand by accumulating both the direct and indirect ones (e.g. those found listed in file(s) signaled by the `@` syntax at the command line). Afterwards, the list is used either in the sequential or parallel merge depending on user input.

---------

Co-authored-by: Philippe Canal <pcanal@fnal.gov>
Co-authored-by: Vincenzo Eduardo Padulano <vincenzo.eduardo.padulano@cern.ch>
lobis pushed a commit to lobis/root that referenced this issue Apr 10, 2024
…oot-project#14913)

Fixes root-project#14910

The full list of input files to merge is now created beforehand by accumulating both the direct and indirect ones (e.g. those found listed in file(s) signaled by the `@` syntax at the command line). Afterwards, the list is used either in the sequential or parallel merge depending on user input.

---------

Co-authored-by: Philippe Canal <pcanal@fnal.gov>
Co-authored-by: Vincenzo Eduardo Padulano <vincenzo.eduardo.padulano@cern.ch>
vepadulano added a commit that referenced this issue Apr 10, 2024
…14913)

Fixes #14910

The full list of input files to merge is now created beforehand by accumulating both the direct and indirect ones (e.g. those found listed in file(s) signaled by the `@` syntax at the command line). Afterwards, the list is used either in the sequential or parallel merge depending on user input.

---------

Co-authored-by: Philippe Canal <pcanal@fnal.gov>
Co-authored-by: Vincenzo Eduardo Padulano <vincenzo.eduardo.padulano@cern.ch>
kristupaspranc pushed a commit to kristupaspranc/root that referenced this issue May 21, 2024
…oot-project#14913)

Fixes root-project#14910

The full list of input files to merge is now created beforehand by accumulating both the direct and indirect ones (e.g. those found listed in file(s) signaled by the `@` syntax at the command line). Afterwards, the list is used either in the sequential or parallel merge depending on user input.

---------

Co-authored-by: Philippe Canal <pcanal@fnal.gov>
Co-authored-by: Vincenzo Eduardo Padulano <vincenzo.eduardo.padulano@cern.ch>
silverweed pushed a commit to silverweed/root that referenced this issue Aug 19, 2024
…oot-project#14913)

Fixes root-project#14910

The full list of input files to merge is now created beforehand by accumulating both the direct and indirect ones (e.g. those found listed in file(s) signaled by the `@` syntax at the command line). Afterwards, the list is used either in the sequential or parallel merge depending on user input.

---------

Co-authored-by: Philippe Canal <pcanal@fnal.gov>
Co-authored-by: Vincenzo Eduardo Padulano <vincenzo.eduardo.padulano@cern.ch>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants