Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Patterns] Test discovery rules for PHP #9

Open
compaluca opened this issue Feb 2, 2023 · 9 comments
Open

[Patterns] Test discovery rules for PHP #9

compaluca opened this issue Feb 2, 2023 · 9 comments
Assignees

Comments

@compaluca
Copy link
Contributor

Problem statement

Discovery rules (when avaliable) were tested for PHP patterns via the tpframework new functionality:

tpframework checkdiscoveryrules --export test_export.csv -l PHP -a --output-dir /tp-framework/out/20230201_check_discovery_rules-PHP 

Here a summary of the results:

  PHP - 02022023
counting 141
successful 102
unsuccessful 12
error 27

This means that:

  • for 12 pattern instances the discovery rule was run, but the pattern instance was not discovered. Thus the discovery rule must have some mistakes
  • for 27 cases an error occurs either at the pattern level or at the pattern instance level. Maybe the discovery rule is not there or it is not a scala rule, or the pattern does not fullfil the file system structure

Here the raw data in a zip comprising:

Proposed changes

Check the issues reported and fix the patterns that can be fixed

  • @mal-tee : can you please check the patterns you have already reviewed? Is there anyone that needs to be updated?
  • @compaluca : I will check some of the others
@mal-tee
Copy link
Contributor

mal-tee commented Feb 2, 2023

Some of my already review patterns have an error.

  • lots of object of type 'int' has no len(). I can't make anything of this yet.
  • one "discovery method python is not supported", but we want to get rid of this either way if it's possible

@compaluca
Copy link
Contributor Author

@mal-tee : the logger can be extended with some more info and I will try to do so. However that specific error: object of type 'int' has no len() seems to be caused by the parsing of the discovery rule results. Rather than a list of findings we only get 1 as result (see last element of the raw output in the log). Does that indicate a failure in executing the discovery rule?

2023-02-01 16:57:41 - tpframework.core.discovery - INFO - run_discovery_rule - Discovery - rule raw output: (/tp-framework/out/20230201_check_discovery_rules-PHP/check_discovery_rules_2023-02-01-16-57-11_PHP_1_instance_37_callstatic_overloading/cpg_2023-02-01-16-57-11_PHP_1_instance_37_callstatic_overloading.bin,37_callstatic_overloading_iall,1)

You can try to run yourself on the branch refactoring (here done for pattern 37 but you can change it):
tpframework checkdiscoveryrules --export test_export.csv -l PHP -p 37 --output-dir /tp-framework/out/20230201_check_discovery_rules-PHP

You will get a sub-folder within the results folder that will contain the generated cpg in case you want to run it manually.

@mal-tee
Copy link
Contributor

mal-tee commented Feb 3, 2023

Seems like some discovery rules end with .size and therefore return a int. I'll change them so that they end with .toJson and can be properly parsed by the framework.

@mal-tee
Copy link
Contributor

mal-tee commented Feb 3, 2023

Hmm, some nodes in the php cpg don't have a lineNumber. How can we deal with this?
What I was usually doing outside of this framework is going up the ast-parent chain, until I encounter a lineNumber. I did this inside scala code and not the joern shell.

@pr0me what's a smart way to fix this here? My current solution is to fix this via

.repeat(_.astParent)(_.until(_.filter(x => x.lineNumber.getOrElse(-1) != -1))).location.toJson
// sometimes the generator might assign Some(-1) for unknown numbers. That's why I use the .getOrElse(-1) != -1 construct, to cover None and Some(-1).

instead of

.location.toJson

Is there a smarter way? This would make the discovery rules quite bulky.

@SoheilKhodayari
Copy link
Member

FYI, When working on issue #10, I am facing the same problem for pattern 76 in JS. The query is as below and joern is not returning the line number.

 cpg.assignment.where(n=> n.code("^(?!(function|_tmp)).*\\..*=.*$")).method.callIn.location.toJson

If anyone has some hints on why that happens, it is really appreciated!

@mal-tee I tried your solution of traversing the AST upwards (as above), and it seem to not work in my case, returning empty results like below. Have you also encountered this issue?

{'symbol': '<empty>', 'packageName': '<empty>', 'nodeLabel': '', 'methodShortName': '<empty>', 'methodFullName': '<empty>', 'filename': '<empty>', 'classShortName': '<empty>', 'className': '<empty>'}

@mal-tee
Copy link
Contributor

mal-tee commented Feb 3, 2023

Interesting!

@mal-tee I tried your solution of traversing the AST upwards (as above), and it seem to not work in my case, returning empty results like below. Have you also encountered this issue?

{'symbol': '<empty>', 'packageName': '<empty>', 'nodeLabel': '', 'methodShortName': '<empty>', 'methodFullName': '<empty>', 'filename': '<empty>', 'classShortName': '<empty>', 'className': '<empty>'}

No, haven't seen that so far.

@mal-tee
Copy link
Contributor

mal-tee commented Feb 9, 2023

@mal-tee
Copy link
Contributor

mal-tee commented Feb 9, 2023

Hmm, some nodes in the php cpg don't have a lineNumber. How can we deal with this? What I was usually doing outside of this framework is going up the ast-parent chain, until I encounter a lineNumber. I did this inside scala code and not the joern shell.

@pr0me what's a smart way to fix this here? My current solution is to fix this via

.repeat(_.astParent)(_.until(_.filter(x => x.lineNumber.getOrElse(-1) != -1))).location.toJson
// sometimes the generator might assign Some(-1) for unknown numbers. That's why I use the .getOrElse(-1) != -1 construct, to cover None and Some(-1).

instead of

.location.toJson

Is there a smarter way? This would make the discovery rules quite bulky.

@compaluca As discussed in our last meeting, I manually tried this workaround on a few instances (58, 19_1, 56,3_1,and 3_2). The fix didn't change the outcome of the discovery rules for the instances that were ok beforehand (58, 19_1). But it also fixed 56, 3_1, and 3_2 which had missing linenos before.

Speaking from this data integrating this into the framework should be ok until we fix linenumbers in the php-cpg.

@compaluca
Copy link
Contributor Author

After implementing the work-around in the framework, we will repeat the test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants