close
The Wayback Machine - https://web.archive.org/web/20220303092614/https://github.com/github/codeql/pull/4881
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CPP: Add query for CWE-401 memory leak on unsuccessful call to realloc function #4881

Merged
merged 18 commits into from Jan 13, 2021
Merged

Conversation

ihsinme
Copy link
Contributor

@ihsinme ihsinme commented Dec 26, 2020

Good day.
This is my second request. I am grateful for @jbj's help. I tried to account for all the issues of the first request and indentation and tests.

This error is quite specific, many people think that if memory cannot be allocated, then the program will terminate. However, under conditions of non-criticality for the main program of allocated memory and strong segmentation, a memory leak may form.

In this PR, I consider only two exceptions, the processing of allocated memory with the termination of the main program through the call to exit and assert within the parent function. In the next iteration, I plan to add handling of these situations outside of the function.

Information about the found and accepted fix in the project: NLnetLabs/unbound#341

Copy link
Contributor

@geoffw0 geoffw0 left a comment

Some initial thoughts:

  • this query looks promising, and is certainly capable of finding genuine issues that our existing memory queries (in particular cpp/memory-never-freed and cpp/memory-may-not-be-freed) do not. 👍
  • the tests are fairly thorough. 👍
  • I've made quite a few suggestions; feel free to pick and choose what you think is most important, and ask for help if you need it.

@ihsinme
Copy link
Contributor Author

@ihsinme ihsinme commented Jan 6, 2021

Thank you for such a large number of adjustments, I will try to fix them promptly.

ihsinme and others added 7 commits Jan 6, 2021
…dCallToRealloc.qhelp

Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
…dCallToRealloc.qhelp

Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
…dCallToRealloc.ql

Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
@ihsinme
Copy link
Contributor Author

@ihsinme ihsinme commented Jan 7, 2021

good day.
I changed the code according to your comments.
I would like to draw your attention to the following.
1.you were absolutely correct regarding the assert call, I was confused by the codeql test run.
1.1. the options you suggested are good, but they give too many false positives. (forgotten return, empty functions, too many destructors)
1.2. so I suggest using your hint to define functions containing exit () in the isExistsIfWithExitCall () predicate, where it is more logical. (Of course, we will still have false positives, for example, when the nesting of functions before exit is large, or when the function calls exit for a reason that is not related to a null buffer. But it seems to me that this is a permissible error at this stage.)
2. I changed the assert call detection to the __assert_fail detection with the text regexp, I checked it on real projects.
2.1. I also could not complete your proposal for including #define assert, in terms of leaving the exit call in the function. this would lead to detection via the isExistsIfWithExitCall () predicate.
3. unfortunately I'm not sure that copying the changes did not hook / r / n and so far I am not able to check it in PR.

@geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Jan 12, 2021

Hi, thanks for the changes and thank you for your contribution. Results look promising, it certainly finds some issues! I think there are still some improvements to be made, but I intend merge this (once the checks pass) and then try some tweaks myself. Maybe get that false positive rate down a bit more.

👍

@ihsinme
Copy link
Contributor Author

@ihsinme ihsinme commented Jan 12, 2021

please tell me.
Can I open an "All For One, One For All" objection request now, or do I still need to wait for the results of your evaluations.

@geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Jan 13, 2021

We think it's probably a good idea to make the application now with the query in its current state, as my follow-up changes won't factor into the evaluation anyway.

The checks have passed so I'm merging this now.

Copy link
Contributor

@geoffw0 geoffw0 left a comment

Approved for merging.

@geoffw0 geoffw0 merged commit 6966453 into github:main Jan 13, 2021
8 checks passed
@ihsinme
Copy link
Contributor Author

@ihsinme ihsinme commented Jan 13, 2021

@kevinbackhouse
Copy link
Contributor

@kevinbackhouse kevinbackhouse commented Jan 19, 2021

I have two suggestions:

  1. It looks like this query is intended to ignore cases where a failed allocation is immediately followed by an exit or abort, but it is not working very well. Click here for an example. It looks like the query is using line numbers, which is not very reliable. It would be better to use post-dominators, which we have a library for.
  2. I think it is currently matching cases like s->p = realloc (s->p, n) kind of by accident. I think the current logic would also trigger for s->p = realloc (t->p, n), which would be incorrect. I would suggest trying the HashCons library.

@kevinbackhouse
Copy link
Contributor

@kevinbackhouse kevinbackhouse commented Jan 19, 2021

@geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Jan 19, 2021

I already have a solution to issue 1, I will make a PR for it now so that we don't end up with merge issues. I was not aware of issue 2 so that's something you could look into.

@ihsinme
Copy link
Contributor Author

@ihsinme ihsinme commented Jan 19, 2021

thanks for your comment.
I agree that abort should be added.
I've only considered the exit function.

regarding problem 2, I found such calls to be extremely rare (at least on my test set of projects).

@geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Jan 19, 2021

My suggested changes for 1: #4979

@ihsinme
Copy link
Contributor Author

@ihsinme ihsinme commented Jan 19, 2021

I don't really understand my role in said discussion. so I immediately apologize if I write something superfluous.

regarding @geoffw0's suggested change.
as I wrote above, this approach gives too much false detection. thus it is defined (functions in which they forgot to put rethurn, destructors, empty functions). I also wanted to draw your attention to the fact that my idea was to find such realloc calls immediately after which the programmer worked to stop the program. As far as I understand, your fix excludes in general all cases in which after realloc it is possible to exit the program, even not associated with realloc itself.

I once again apologize if my words were not useful here.

@ihsinme
Copy link
Contributor Author

@ihsinme ihsinme commented Jan 19, 2021

in more detail regarding problem 2.

as I wrote above, I did not set myself the goal of tracking these situations, if you consider them critical, I will certainly supplement the detection with this check. but I want to draw your attention that there are several types of vulnerable use of realloc that will not be detected by my request. at the same time, I consider them quite rare. at present, when such a mistake is widely presented, I would not want to complicate the query. as an example I can give the expression a = b = realloc (a, n) which is also not detected.

but I do not have enough experience to say something, so I ask you to tell me more specifically what I should do in this situation.

Thank you.

@geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Jan 20, 2021

functions in which they forgot to put rethurn, destructors, empty functions

I don't think these are special cases, they are all functions which return.

As far as I understand, your fix excludes in general all cases in which after realloc it is possible to exit the program, even not associated with realloc itself.

Yes, it looks for any program termination in the same function as the realloc that can be reached after it. We could be more accurate in this regard. In general though, we find that people respond better to queries which produce a smaller number of high quality results than a longer list that includes a lot of false positives - so we prefer (at first) to minimize the number of false positives even if it costs a few true positive results.

@ihsinme
Copy link
Contributor Author

@ihsinme ihsinme commented Jan 20, 2021

I want you to understand me correctly, I'm new to your system. plus not a native speaker. so if my words seem stupid, then I apologize for the time spent on them.

I don't think these are special cases, they are all functions which return.

maybe I'm wrong, in this matter I need time. I will try to make a test demonstrating this problem. but for example, in the dovecot project, under your CallMayNotReturn class, almost all functions for working with memory fall into

Yes, it looks for any program termination in the same function as the realloc that can be reached after it. We could be more accurate in this regard. In general though, we find that people respond better to queries which produce a smaller number of high quality results than a longer list that includes a lot of false positives - so we prefer (at first) to minimize the number of false positives even if it costs a few true positive results.

this is very reasonable, but only for the main (base) set of queries.
in my opinion, experimental queries are used by those developers who want to see more about their code.
in this case, it is probably better to show unnecessary than not to show an error.

at the same time, I would like to clarify my next steps with this request. can he still qualify for the reward or the above errors and your change will revoke this right? What other actions would be helpful on my part?

@geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Jan 20, 2021

for example, in the dovecot project, under your CallMayNotReturn class, almost all functions for working with memory fall into

I see what you mean, p_malloc contains a call to i_panic (which seems to call abort or exit) in the case of an allocation failure, but it's very different to the conditional termination of, say, an assert. I'll look into reintroducing some more of your constraints (but we're not very keen on keeping getStartLine and toString as in your original solution, both are problematic).

can he still qualify for the reward or the above errors and your change will revoke this right?

I'm not familiar with the process, but my understanding is that you will qualify based on the version before my change, and possibly contributions you make afterwards as well (but I'm not sure on the bit).

@geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Jan 20, 2021

Another improvement which I believe addresses the above: #4990 - @ihsinme , I'd appreciate your thoughts on that pull request.

@ihsinme
Copy link
Contributor Author

@ihsinme ihsinme commented Jan 20, 2021

I wrote my vision in PR, I remind you that I could be wrong.
I hope this will be useful to you

at the same time I would like to ask you to prompt me for a similar script to estimate the running time of a predicate.

@geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Jan 21, 2021

at the same time I would like to ask you to prompt me for a similar script to estimate the running time of a predicate.

I usually run queries locally when I want to examine running times and performance. In CodeQL for Visual Studio Code you can see run times and access logs from the query history window. This can also be done with command line tools.

However query optimization is a complicated and often counterintuitive subject, and the QL optimizer usually does an excellent job automatically. Do you have any concerns for this query?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants