Varying results between e2e test runs

I recently took over a mature codebase that does frontend e2e testing with Chrome WebDriver and CodeceptJS. Results vary a lot between test runs: sometimes it shows 10 errors, sometimes up to 20 (out of 100). For example, an error that is reported when running the whole test suite is not reproducible when running that test case alone.

I think this is a bad situation. It does not feel stable or reliable. Is this common in frontend testing? How would I tackle such an issue? Are there specific points I have to keep in mind when trying to improve this behavior? Which documentation would you recommend to get me started?