David Malcolm 6cf276ddf2 diagnostics: add SARIF output format
This patch adds support to gcc's diagnostic subsystem for emitting
diagnostics in SARIF, aka the Static Analysis Results Interchange Format:
  https://sarifweb.azurewebsites.net/
by extending -fdiagnostics-format= to add two new options:
  -fdiagnostics-format=sarif-stderr
and:
  -fdiagnostics-format=sarif-file

The patch targets SARIF v2.1.0.

This is a JSON-based format suited for capturing the results of static
analysis tools (like GCC's -fanalyzer), but it can also be used for plain
GCC warnings and errors.

SARIF supports per-event metadata in diagnostic paths such as
["acquire", "resource"] and ["release", "lock"] (specifically, the
threadFlowLocation "kinds" property: SARIF v2.1.0 section 3.38.8), so
the patch extends GCC's diagnostic_event subclass with a "struct meaning"
with similar purpose.  The patch implements this for -fanalyzer so that
the various state-machine-based warnings set these in the SARIF output.
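
As a rough illustration of the idea above, the sketch below shows how a
verb/noun pair could be rendered as a SARIF "kinds" array.  It is not
the code from diagnostic-path.h: the "sketch" namespace, the
enumerators, and make_kinds_json are assumptions made for this example,
and the property part of the meaning is left out for brevity.  Only the
names "meaning", "verb", "noun", maybe_get_verb_str and
maybe_get_noun_str are taken from the ChangeLog below.

```cpp
#include <string>

// Hypothetical sketch; the enumerators are assumptions, not copied
// from GCC's diagnostic-path.h.
namespace sketch {

enum class verb { unknown, acquire, release };
enum class noun { unknown, resource, lock };

struct meaning
{
  meaning (verb v = verb::unknown, noun n = noun::unknown)
  : m_verb (v), m_noun (n) {}

  verb m_verb;
  noun m_noun;
};

// Return a string for V, or nullptr if there is nothing to report.
inline const char *maybe_get_verb_str (verb v)
{
  switch (v)
    {
    case verb::acquire: return "acquire";
    case verb::release: return "release";
    default: return nullptr;
    }
}

// Return a string for N, or nullptr if there is nothing to report.
inline const char *maybe_get_noun_str (noun n)
{
  switch (n)
    {
    case noun::resource: return "resource";
    case noun::lock: return "lock";
    default: return nullptr;
    }
}

// Render M as a JSON array of strings, skipping unknown parts.
// (make_kinds_json is a name invented for this sketch.)
inline std::string make_kinds_json (const meaning &m)
{
  const char *parts[] = { maybe_get_verb_str (m.m_verb),
                          maybe_get_noun_str (m.m_noun) };
  std::string result = "[";
  bool first = true;
  for (const char *part : parts)
    {
      if (!part)
        continue;
      if (!first)
        result += ", ";
      result += '"';
      result += part;
      result += '"';
      first = false;
    }
  result += "]";
  return result;
}

} // namespace sketch
```

With these definitions, a meaning built from verb::acquire and
noun::resource renders as ["acquire", "resource"], and a
default-constructed meaning renders as an empty array.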

The heart of the implementation is in the new file
diagnostic-format-sarif.cc.  Much of the rest of the patch is interface
classes, isolating the diagnostic subsystem (which has no knowledge of
e.g. tree or langhook) from the "client" code in the compiler proper
(cc1 etc).
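
The isolation described above can be pictured with a small sketch: the
diagnostic side sees only an abstract interface, and the compiler side
supplies an implementation that wraps its own data structures.  The
class name logical_location comes from the ChangeLog below; everything
else here (the "sketch" namespace, get_short_name, fake_decl,
decl_logical_location, describe_location) is hypothetical and is not
the interface defined in logical-location.h.

```cpp
#include <string>

namespace sketch {

// Diagnostic-subsystem side: an abstract interface that knows nothing
// about the compiler's internal representation.
class logical_location
{
public:
  virtual ~logical_location () {}
  // Hypothetical accessor; the real interface may differ.
  virtual const char *get_short_name () const = 0;
};

// Diagnostic-subsystem side: formats text using only the interface.
inline std::string describe_location (const logical_location *loc)
{
  if (!loc)
    return "";
  return std::string ("in function '") + loc->get_short_name () + "'";
}

// Client side: a stand-in for a compiler-internal declaration node.
struct fake_decl
{
  std::string name;
};

// Client side: adapts the internal node to the abstract interface.
class decl_logical_location : public logical_location
{
public:
  explicit decl_logical_location (const fake_decl &decl) : m_decl (decl) {}

  const char *get_short_name () const override
  {
    return m_decl.name.c_str ();
  }

private:
  const fake_decl &m_decl;
};

} // namespace sketch
```

The point of the split is that describe_location can be compiled
without any knowledge of fake_decl, much as the diagnostic subsystem is
kept free of tree.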

The patch adds a langhook for specifying the SARIF v2.1.0
"artifact.sourceLanguage" property, based on the list in
SARIF v2.1.0 Appendix J.
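
A minimal sketch of the langhook pattern follows, assuming the hook
takes a filename and returns either a source-language string or a null
pointer when the language is unknown.  The function names match the
ChangeLog below, but the signature, the table layout, the "c" return
value, and the helper source_language_or_empty are assumptions made
for this example rather than the patch's actual code.

```cpp
#include <string>

namespace sketch {

// A cut-down hooks table holding just the one hook (assumed signature).
struct lang_hooks
{
  const char *(*get_sarif_source_language) (const char *filename);
};

// Default implementation: no source language is known.
inline const char *lhd_get_sarif_source_language (const char *)
{
  return nullptr;
}

// A frontend's override, assuming "c" is the value a C frontend
// would report.
inline const char *c_get_sarif_source_language (const char *)
{
  return "c";
}

const lang_hooks default_hooks = { lhd_get_sarif_source_language };
const lang_hooks c_hooks = { c_get_sarif_source_language };

// Hypothetical consumer: an empty result means the property would be
// omitted from the output.
inline std::string source_language_or_empty (const lang_hooks &hooks,
                                             const char *filename)
{
  const char *lang = hooks.get_sarif_source_language (filename);
  return lang ? lang : "";
}

} // namespace sketch
```

Each frontend then only needs to supply its own function and redefine
the corresponding initializer macro, which is what the per-language
ChangeLog entries below record.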

The patch adds automated DejaGnu tests to our testsuite via new
scan-sarif-file and scan-sarif-file-not directives (although these
merely use regexps, rather than attempting to use a proper JSON parser).
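
To make the regexp caveat concrete, here is a hedged sketch of what a
regexp-based scan amounts to.  The real directives are implemented in
Tcl in lib/scansarif.exp; the C++ functions below (scan_sarif_text and
scan_sarif_text_not, both names invented here) only model the idea of
searching the raw file text for a pattern.

```cpp
#include <regex>
#include <string>

namespace sketch {

// Succeeds if PATTERN matches anywhere in the raw SARIF text.
inline bool scan_sarif_text (const std::string &sarif_text,
                             const std::string &pattern)
{
  return std::regex_search (sarif_text, std::regex (pattern));
}

// Succeeds if PATTERN matches nowhere in the raw SARIF text.
inline bool scan_sarif_text_not (const std::string &sarif_text,
                                 const std::string &pattern)
{
  return !scan_sarif_text (sarif_text, pattern);
}

} // namespace sketch
```

Because the match runs over raw text, a pattern such as
"version": "2.1.0" depends on the exact whitespace the writer emits,
and it cannot tell which JSON object the property belongs to; a JSON
parser would avoid both limitations.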

I've tested the patch by hand using the validator at:
  https://sarifweb.azurewebsites.net/Validation
and the react-based viewer at:
  https://microsoft.github.io/sarif-web-component/
which successfully shows most of the information (although not paths,
and not CWE IDs), and I've fixed all validation errors I've seen (though
bugs no doubt remain).

I've also tested the generated SARIF using the VS Code extension linked
to from the SARIF website; I'm a novice with VS Code, but it seems to be
able to handle my generated SARIF files (e.g. showing the data in the
SARIF tab, and showing squiggly underlines under issues, and when I
click on them, it visualizes the events in the path inline within the
source window).

Has anyone written an Emacs mode for SARIF files? (pretty please)

gcc/ChangeLog:
	* Makefile.in (OBJS): Add tree-diagnostic-client-data-hooks.o and
	tree-logical-location.o.
	(OBJS-libcommon): Add diagnostic-format-sarif.o; reorder.
	(CFLAGS-tree-diagnostic-client-data-hooks.o): Add TARGET_NAME.
	* common.opt (fdiagnostics-format=): Add sarif-stderr and sarif-file.
	(sarif-stderr, sarif-file): New enum values.
	* diagnostic-client-data-hooks.h: New file.
	* diagnostic-format-sarif.cc: New file.
	* diagnostic-path.h (enum diagnostic_event::verb): New enum.
	(enum diagnostic_event::noun): New enum.
	(enum diagnostic_event::property): New enum.
	(struct diagnostic_event::meaning): New struct.
	(diagnostic_event::get_logical_location): New vfunc.
	(diagnostic_event::get_meaning): New vfunc.
	(simple_diagnostic_event::get_logical_location): New vfunc impl.
	(simple_diagnostic_event::get_meaning): New vfunc impl.
	* diagnostic.cc: Include "diagnostic-client-data-hooks.h".
	(diagnostic_initialize): Initialize m_client_data_hooks.
	(diagnostic_finish): Clean up m_client_data_hooks.
	(diagnostic_event::meaning::dump_to_pp): New.
	(diagnostic_event::meaning::maybe_get_verb_str): New.
	(diagnostic_event::meaning::maybe_get_noun_str): New.
	(diagnostic_event::meaning::maybe_get_property_str): New.
	(get_cwe_url): Make non-static.
	(diagnostic_output_format_init): Handle
	DIAGNOSTICS_OUTPUT_FORMAT_SARIF_STDERR and
	DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE.
	* diagnostic.h (enum diagnostics_output_format): Add
	DIAGNOSTICS_OUTPUT_FORMAT_SARIF_STDERR and
	DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE.
	(class diagnostic_client_data_hooks): New forward decl.
	(class logical_location): New forward decl.
	(diagnostic_context::m_client_data_hooks): New field.
	(diagnostic_output_format_init_sarif_stderr): New decl.
	(diagnostic_output_format_init_sarif_file): New decl.
	(get_cwe_url): New decl.
	* doc/invoke.texi (-fdiagnostics-format=): Add sarif-stderr and
	sarif-file.
	* doc/sourcebuild.texi (Scan a particular file): Add
	scan-sarif-file and scan-sarif-file-not.
	* langhooks-def.h (lhd_get_sarif_source_language): New decl.
	(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): New macro.
	(LANG_HOOKS_INITIALIZER): Add
	LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE.
	* langhooks.cc (lhd_get_sarif_source_language): New.
	* langhooks.h (lang_hooks::get_sarif_source_language): New field.
	* logical-location.h: New file.
	* plugin.cc (struct for_each_plugin_closure): New.
	(for_each_plugin_cb): New.
	(for_each_plugin): New.
	* plugin.h (for_each_plugin): New decl.
	* tree-diagnostic-client-data-hooks.cc: New file.
	* tree-diagnostic.cc: Include "diagnostic-client-data-hooks.h".
	(tree_diagnostics_defaults): Populate m_client_data_hooks.
	* tree-logical-location.cc: New file.
	* tree-logical-location.h: New file.

gcc/ada/ChangeLog:
	* gcc-interface/misc.cc (gnat_get_sarif_source_language): New.
	(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.

gcc/analyzer/ChangeLog:
	* checker-path.cc (checker_event::get_meaning): New.
	(function_entry_event::get_meaning): New.
	(state_change_event::get_desc): Add dump of meaning of the event
	to the -fanalyzer-verbose-state-changes output.
	(state_change_event::get_meaning): New.
	(cfg_edge_event::get_meaning): New.
	(call_event::get_meaning): New.
	(return_event::get_meaning): New.
	(start_consolidated_cfg_edges_event::get_meaning): New.
	(warning_event::get_meaning): New.
	* checker-path.h: Include "tree-logical-location.h".
	(checker_event::checker_event): Construct m_logical_loc.
	(checker_event::get_logical_location): New.
	(checker_event::get_meaning): New decl.
	(checker_event::m_logical_loc): New.
	(function_entry_event::get_meaning): New decl.
	(state_change_event::get_meaning): New decl.
	(cfg_edge_event::get_meaning): New decl.
	(call_event::get_meaning): New decl.
	(return_event::get_meaning): New decl.
	(start_consolidated_cfg_edges_event::get_meaning): New decl.
	(warning_event::get_meaning): New decl.
	* pending-diagnostic.h: Include "diagnostic-path.h".
	(pending_diagnostic::get_meaning_for_state_change): New vfunc.
	* sm-file.cc (file_diagnostic::get_meaning_for_state_change): New
	vfunc impl.
	* sm-malloc.cc (malloc_diagnostic::get_meaning_for_state_change):
	Likewise.
	* sm-sensitive.cc
	(exposure_through_output_file::get_meaning_for_state_change):
	Likewise.
	* sm-taint.cc (taint_diagnostic::get_meaning_for_state_change):
	Likewise.
	* varargs.cc
	(va_list_sm_diagnostic::get_meaning_for_state_change): Likewise.

gcc/c/ChangeLog:
	* c-lang.cc (LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
	(c_get_sarif_source_language): New.
	* c-tree.h (c_get_sarif_source_language): New decl.

gcc/cp/ChangeLog:
	* cp-lang.cc (LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
	(cp_get_sarif_source_language): New.

gcc/d/ChangeLog:
	* d-lang.cc (d_get_sarif_source_language): New.
	(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.

gcc/fortran/ChangeLog:
	* f95-lang.cc (gfc_get_sarif_source_language): New.
	(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.

gcc/go/ChangeLog:
	* go-lang.cc (go_get_sarif_source_language): New.
	(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.

gcc/objc/ChangeLog:
	* objc-act.h (objc_get_sarif_source_language): New decl.
	* objc-lang.cc (LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
	(objc_get_sarif_source_language): New.

gcc/testsuite/ChangeLog:
	* c-c++-common/diagnostic-format-sarif-file-1.c: New test.
	* c-c++-common/diagnostic-format-sarif-file-2.c: New test.
	* c-c++-common/diagnostic-format-sarif-file-3.c: New test.
	* c-c++-common/diagnostic-format-sarif-file-4.c: New test.
	* gcc.dg/analyzer/file-meaning-1.c: New test.
	* gcc.dg/analyzer/malloc-meaning-1.c: New test.
	* gcc.dg/analyzer/malloc-sarif-1.c: New test.
	* gcc.dg/plugin/analyzer_gil_plugin.c
	(gil_diagnostic::get_meaning_for_state_change): New vfunc impl.
	* gcc.dg/plugin/diagnostic-test-paths-5.c: New test.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
	diagnostic-test-paths-5.c to tests for
	diagnostic_plugin_test_paths.c.
	* lib/gcc-dg.exp: Load scansarif.exp.
	* lib/scansarif.exp: New test.

libatomic/ChangeLog:
	* testsuite/lib/libatomic.exp: Add load_gcc_lib of scansarif.exp.

libgomp/ChangeLog:
	* testsuite/lib/libgomp.exp: Add load_gcc_lib of scansarif.exp.

libitm/ChangeLog:
	* testsuite/lib/libitm.exp: Add load_gcc_lib of scansarif.exp.

libphobos/ChangeLog:
	* testsuite/lib/libphobos-dg.exp: Add load_gcc_lib of scansarif.exp.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-06-02 15:40:22 -04:00
c++tools Daily bump. 2022-03-19 00:16:22 +00:00
config Daily bump. 2022-06-02 00:16:32 +00:00
contrib Daily bump. 2022-05-28 00:16:40 +00:00
fixincludes Daily bump. 2022-02-28 00:16:17 +00:00
gcc diagnostics: add SARIF output format 2022-06-02 15:40:22 -04:00
gnattools Daily bump. 2021-10-23 00:16:26 +00:00
gotools Daily bump. 2022-02-14 00:16:23 +00:00
include Daily bump. 2022-06-01 00:16:34 +00:00
INSTALL
intl Daily bump. 2021-11-30 00:16:44 +00:00
libada Update copyright years. 2022-01-03 10:42:10 +01:00
libatomic diagnostics: add SARIF output format 2022-06-02 15:40:22 -04:00
libbacktrace Daily bump. 2022-05-29 00:16:31 +00:00
libcc1 Daily bump. 2022-06-02 00:16:32 +00:00
libcody Daily bump. 2022-03-19 00:16:22 +00:00
libcpp Daily bump. 2022-05-30 00:16:21 +00:00
libdecnumber Daily bump. 2022-05-21 00:16:32 +00:00
libffi Daily bump. 2021-11-16 00:16:31 +00:00
libgcc Daily bump. 2022-06-02 00:16:32 +00:00
libgfortran Daily bump. 2022-01-27 00:16:29 +00:00
libgo runtime: use correct field name for PPC32 GLIBC registers 2022-04-20 17:49:44 -07:00
libgomp diagnostics: add SARIF output format 2022-06-02 15:40:22 -04:00
libiberty Daily bump. 2022-05-24 00:17:03 +00:00
libitm diagnostics: add SARIF output format 2022-06-02 15:40:22 -04:00
libobjc Update copyright years. 2022-01-03 10:42:10 +01:00
liboffloadmic Daily bump. 2021-10-20 00:16:43 +00:00
libphobos diagnostics: add SARIF output format 2022-06-02 15:40:22 -04:00
libquadmath Daily bump. 2022-01-12 00:16:39 +00:00
libsanitizer Daily bump. 2022-05-06 00:16:26 +00:00
libssp Update copyright years. 2022-01-03 10:42:10 +01:00
libstdc++-v3 Daily bump. 2022-05-28 00:16:40 +00:00
libvtv Update copyright years. 2022-01-03 10:42:10 +01:00
lto-plugin Daily bump. 2022-05-05 00:16:29 +00:00
maintainer-scripts Daily bump. 2022-05-21 00:16:32 +00:00
zlib Daily bump. 2021-12-17 00:16:20 +00:00
.dir-locals.el dir-locals: Use https for bug references 2021-07-20 11:40:34 +01:00
.gitattributes
.gitignore Vim swap files not ignored 2022-05-28 09:38:29 -06:00
ABOUT-NLS
ar-lib
ChangeLog Daily bump. 2022-05-29 00:16:31 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in
config.guess config.sub, config.guess : Import upstream 2021-01-25. 2021-02-23 17:21:10 +08:00
config.rpath
config.sub config.sub: change mode to 755. 2021-12-21 09:10:57 +01:00
configure LoongArch Port: Regenerate configure 2022-03-29 17:43:32 +08:00
configure.ac LoongArch Port: Regenerate configure 2022-03-29 17:43:32 +08:00
COPYING
COPYING3
COPYING3.LIB
COPYING.LIB
COPYING.RUNTIME
depcomp
install-sh
libtool-ldflags
libtool.m4 Revert "Sync with binutils: GCC: Pass --plugin to AR and RANLIB" 2021-12-15 20:45:58 -08:00
lt~obsolete.m4
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
MAINTAINERS MAINTAINERS: Add myself to write after approval 2022-05-13 09:30:38 -05:00
Makefile.def toplevel: Makefile.def: Make configure-sim depend on all-readline 2022-03-09 20:54:37 +01:00
Makefile.in toplevel: Makefile.def: Make configure-sim depend on all-readline 2022-03-09 20:54:37 +01:00
Makefile.tpl Revert "Sync with binutils: GCC: Pass --plugin to AR and RANLIB" 2021-12-15 20:45:58 -08:00
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.