add error number to file open error messages #260
### Description of work

The recent problems with running `pystack` in the error reporter have been traced back to the fact that `pystack` was trying to open more file descriptors than the limit set on most linux systems. A pr into `pystack` has been made to improve their error messages (bloomberg/pystack#260). When this is a problem, you get a lot of this kind of output

```
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../.././libMantidJson.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../.././libMantidJson.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../.././libMantidJson.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../.././libMantidJson.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidTypes.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidTypes.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidTypes.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidTypes.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidPythonInterfaceCore.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidPythonInterfaceCore.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidPythonInterfaceCore.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidPythonInterfaceCore.
```

The issue started after the first new instrument view pr went in (this commit 8733e98#diff-4a6d6571d2c59705d8a673667672de2d7dcc3c756c61a3f0195acdefac52fe34). We're not that sure why, but it could be the addition of new libraries such as `pyvista`. This pr raises the limit (equivalent to `ulimit -n <value>`) for the outer mantid process, which may eventually launch the error reporter.

### To test:

Package to test is at `mamba install jhaigh0/label/raise_open_files_limit::mantidworkbench`

It would be good to test on idaaas where the core dump location is already set up. Run Segfault and check that the pystack output has been captured in the stacktrace field.
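
For readers unfamiliar with what "equivalent to `ulimit -n <value>`" means in-process, here is a minimal Python sketch using the standard `resource` module (Unix only). It is not the code from the mantid change: the function name `raise_open_files_limit` and the default of 4096 are assumptions made for illustration. It raises the soft limit on open file descriptors without exceeding the hard limit; child processes started afterwards inherit the new value.

```python
import resource


def raise_open_files_limit(desired=4096):
    """Raise the soft RLIMIT_NOFILE towards `desired` (hypothetical default).

    The soft limit is never pushed past the hard limit, and it is never
    lowered. Returns the soft limit in effect after the call.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # An "unlimited" hard limit places no cap on the soft limit.
    if hard == resource.RLIM_INFINITY:
        cap = desired
    else:
        cap = min(desired, hard)

    # Only ever raise the limit; an already-unlimited soft limit is left alone.
    if soft != resource.RLIM_INFINITY and soft < cap:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (cap, hard))
        except (ValueError, OSError):
            # Some platforms reject the request; keep the existing limit.
            pass

    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft open-files limit is now", raise_open_files_limit())
```

Calling this early in the outer process is what lets a later child (such as an error reporter that runs `pystack`) start with the higher limit.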
b42042f to
76370c3
Compare
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #260 +/- ##
=======================================
Coverage 83.07% 83.07%
=======================================
Files 46 46
Lines 6211 6211
Branches 470 470
=======================================
Hits 5160 5160
Misses 1051 1051
godlygeek left a comment:
Looks good to me, thanks for the contribution!
Issue number of the reported bug or feature request: #
Describe your changes
As per #233, I have added more detail to the error messages emitted when opening a file fails, in several locations.
I had to debug an issue in https://github.com/mantidproject/mantid where core dump analysis was failing with many `ERROR(process_core): Cannot open ELF file` messages. The ultimate cause was that the process had exceeded its maximum number of open file descriptors. Having this detail in the error message from the start would have made identifying the issue trivial.
Example old output:
`ERROR(process_core): Cannot open ELF file /lib/x86_64-linux-gnu/libpthread.so.0`
New output:
`ERROR(process_core): Cannot open ELF file /lib/x86_64-linux-gnu/libpthread.so.0 (Too many open files)`
Testing performed
I ran `pystack core <core_dump_file>` and observed the new errors.
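
To illustrate the idea behind the new message format, here is a small Python sketch that appends the operating system's description of the failure to the message. It is not pystack's implementation, which lives in its own source; the helper name `describe_open_failure` is hypothetical and the message text simply mirrors the example output above.

```python
import errno
import os


def describe_open_failure(path):
    """Try to open `path` read-only.

    Returns None on success, otherwise an error message that includes the
    OS-provided description of why the open failed.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        # exc.errno carries the error number; os.strerror turns it into text.
        return f"Cannot open ELF file {path} ({os.strerror(exc.errno)})"
    os.close(fd)
    return None


if __name__ == "__main__":
    # A path that should not exist, so the open fails with ENOENT.
    print(describe_open_failure("/no/such/dir/libexample.so"))
    # The description this platform uses for running out of file descriptors.
    print(os.strerror(errno.EMFILE))
```

The exact wording of the description comes from the platform's C library, so it can differ between systems; the error number itself is what identifies the cause.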