With the ripgrep (rg) command

Description

Recursively searches directories for a regex pattern while respecting your gitignore rules.

ripgrep (rg)

ripgrep is a line-oriented search tool that recursively searches the current directory for a regex pattern.

By default, ripgrep will respect gitignore rules and automatically skip hidden files/directories and binary files. (To disable all automatic filtering by default, use rg -uuu.)
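
The -u flag can be repeated, and each repetition removes one layer of filtering. The following is a minimal sketch of that behavior, assuming a throwaway directory whose file names are invented for the demo. It uses a .ignore file, which ripgrep reads with the same syntax as .gitignore but without needing a Git repository.

```shell
# Throwaway directory; every file name below is invented for this demo.
demo=$(mktemp -d) && cd "$demo" || exit 1
printf 'ignored.txt\n' > .ignore        # same syntax as .gitignore
printf 'needle\n' > visible.txt
printf 'needle\n' > ignored.txt         # filtered by the .ignore rule
printf 'needle\n' > .hidden.txt         # filtered because it is hidden
printf '\0\nneedle\n' > binary.dat      # filtered because it contains a NUL byte

# -l lists matching files only. The path "." is explicit so that ripgrep
# never tries to read from stdin when this runs inside a script.
level0=$(rg -l needle . | sort)         # visible.txt only
level1=$(rg -u -l needle . | sort)      # adds ignored.txt (ignore rules off)
level2=$(rg -uu -l needle . | sort)     # adds .hidden.txt (hidden files on)
level3=$(rg -uuu -l needle . | sort)    # adds binary.dat (binary files on)

printf 'default:\n%s\n-u:\n%s\n-uu:\n%s\n-uuu:\n%s\n' \
  "$level0" "$level1" "$level2" "$level3"
```

Each level lists everything the previous one did plus one more file, so the listing grows from one entry to four.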

ripgrep has first class support on Windows, macOS and Linux, with binary downloads available for every release.

ripgrep is similar to other popular search tools like The Silver Searcher, ack and grep.

Examples

Manual filtering

$ rg clap -g '*.toml' # we could limit ourselves to TOML files,
                      # which is how dependencies are communicated
                      # to Rust's build tool, Cargo

If we wanted, we could tell ripgrep to search anything but *.toml files

$ rg clap -g '!*.toml'
[lots of results]
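
Include and exclude globs can also be combined in one command. Here is a small sketch, assuming a made-up project layout (the src and vendor directories and their files are hypothetical), that keeps Rust and TOML files but drops everything directly under vendor:

```shell
# Hypothetical project layout, created in a temporary directory.
demo=$(mktemp -d) && cd "$demo" || exit 1
mkdir src vendor
printf 'clap = "4"\n' > Cargo.toml
printf 'use clap;\n'  > src/main.rs
printf 'use clap;\n'  > vendor/lib.rs
printf 'clap notes\n' > README.md

# Keep Rust and TOML files, then drop anything directly under vendor/.
# When several globs match a file, the one given last wins.
matches=$(rg -l clap -g '*.{rs,toml}' -g '!vendor/*' . | sort)
printf '%s\n' "$matches"    # lists Cargo.toml and src/main.rs
```

README.md is left out because it matches neither include glob, and vendor/lib.rs is left out because the later exclude glob overrides the earlier include.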

Manual filtering: file types

Over time, you might notice that you use the same glob patterns over and over. For example, you might find yourself doing a lot of searches where you only want to see results for Rust files:

$ rg 'fn run' -g '*.rs'

Instead of writing out the glob every time, you can use ripgrep’s support for file types:

$ rg 'fn run' --type rust

or, more succinctly

$ rg 'fn run' -trust

The way the --type flag functions is simple. It acts as a name that is assigned to one or more globs that match the relevant files.

This lets you write a single type that might encompass a broad range of file extensions. For example, if you wanted to search C files, you’d have to check both C source files and C header files:

$ rg 'int main' -g '*.{c,h}'

or you could just use the C file type

$ rg 'int main' -tc

Just as you can write blacklist globs, you can blacklist file types too:

$ rg clap --type-not rust

or, more succinctly,

$ rg clap -Trust

That is, -t means “include files of this type” whereas -T means “exclude files of this type.”

To see the globs that make up a type, run rg --type-list:

$ rg --type-list | rg '^make:'
make: *.mak, *.mk, GNUmakefile, Gnumakefile, Makefile, gnumakefile, makefile

By default, ripgrep comes with a bunch of pre-defined types. Generally, these types correspond to well known public formats. But you can define your own types as well.

For example, perhaps you frequently search “web” files, which consist of JavaScript, HTML and CSS:

$ rg --type-add 'web:*.html' --type-add 'web:*.css' --type-add 'web:*.js' -tweb title

or, more succinctly,

$ rg --type-add 'web:*.{html,css,js}' -tweb title

The above command defines a new type, web, corresponding to the glob *.{html,css,js}.

It then applies the new filter with -tweb and searches for the pattern title. If you ran

$ rg --type-add 'web:*.{html,css,js}' --type-list

Then you would see your web type show up in the list, even though it is not part of ripgrep’s built-in types.

It is important to stress here that the --type-add flag only applies to the current command. It does not add a new file type and save it somewhere in a persistent form. If you want a type to be available in every ripgrep command, then you should either create a shell alias:

alias rg="rg --type-add 'web:*.{html,css,js}'"

or add --type-add=web:*.{html,css,js} to your ripgrep configuration file. (Configuration files are covered in more detail later.)

The special all file type

A special option supported by the --type flag is all. --type all looks for a match in any of the supported file types listed by --type-list, including those added on the command line using --type-add.

It’s equivalent to the command rg --type agda --type asciidoc --type asm …, where … stands for a list of --type flags for the rest of the types in --type-list.

As an example, let’s suppose you have a shell script in your current directory, my-shell-script, which includes a shell library, my-shell-library.bash.

Both rg --type sh and rg --type all would only search for matches in my-shell-library.bash, not my-shell-script, because the globs matched by the sh file type don’t include files without an extension.

On the other hand, rg --type-not all would search my-shell-script but not my-shell-library.bash.
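
The scenario above can be reproduced in a scratch directory. This is only a sketch: the file contents and the greet function are invented for illustration, and the two file names are the ones used in the text.

```shell
demo=$(mktemp -d) && cd "$demo" || exit 1
# A script without an extension, and the library it sources.
printf '#!/bin/sh\n. ./my-shell-library.bash\ngreet\n' > my-shell-script
printf 'greet() { echo hello; }\n' > my-shell-library.bash

by_sh=$(rg -l greet --type sh .)            # the .bash file only
by_all=$(rg -l greet --type all .)          # the .bash file only
by_not_all=$(rg -l greet --type-not all .)  # the extensionless script only

printf 'sh: %s\nall: %s\nnot all: %s\n' "$by_sh" "$by_all" "$by_not_all"
```

Both files contain the word greet, so any difference in the three listings comes purely from type filtering.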

Replacements

ripgrep provides a limited ability to modify its output by replacing matched text with some other text.

This is easiest to explain with an example. Consider a search for the word fast in ripgrep’s README:

$ rg fast README.md
75:  faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
88:  color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast while
119:### Is it really faster than everything else?
124:Summarizing, `ripgrep` is fast because:
129:  optimizations to make searching very fast.

What if we wanted to replace all occurrences of fast with FAST? That’s easy with ripgrep’s --replace flag:

$ rg fast README.md --replace FAST
75:  FASTer than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
88:  color and full Unicode support. Unlike GNU grep, `ripgrep` stays FAST while
119:### Is it really FASTer than everything else?
124:Summarizing, `ripgrep` is FAST because:
129:  optimizations to make searching very FAST.

or, more succinctly,

$ rg fast README.md -r FAST
[snip]
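
Replacement text can also refer to capture groups from the pattern. The sketch below feeds made-up input lines through a pipe so it does not depend on any particular file; the group names adj and noun are arbitrary. Note that replacement only changes what ripgrep prints and never modifies the files being searched.

```shell
# Numbered groups: swap two words.
swapped=$(printf 'fast search\n' | rg '(\w+) (\w+)' -r '$2 $1')
printf '%s\n' "$swapped"    # search fast

# Named groups; the braces keep the name separate from the text that follows.
renamed=$(printf 'fast search\n' | rg '(?P<adj>\w+) (?P<noun>\w+)' -r '${noun}, ${adj}')
printf '%s\n' "$renamed"    # search, fast

# With -o, only the (replaced) match is printed instead of the whole line.
only=$(printf 'ripgrep is fast\n' | rg -o 'is (\w+)' -r '$1')
printf '%s\n' "$only"       # fast
```

The single quotes matter here: they stop the shell from expanding $1 and $2 before ripgrep sees them.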

Configuration file

It is possible that ripgrep’s default options aren’t suitable in every case.

For that reason, and because shell aliases aren’t always convenient, ripgrep supports configuration files.

Setting up a configuration file is simple. ripgrep will not look in any predetermined directory for a config file automatically.

Instead, you need to set the RIPGREP_CONFIG_PATH environment variable to the file path of your config file.

Once the environment variable is set, open the file and just type in the flags you want set automatically. There are only two rules for describing the format of the config file:

  1. Every line is a shell argument, after trimming whitespace.

  2. Lines starting with # (optionally preceded by any amount of whitespace) are ignored.

In particular, there is no escaping. Each line is given to ripgrep as a single command line argument verbatim.

Here’s an example of a configuration file, which demonstrates some of the formatting peculiarities:

$ cat $HOME/.ripgreprc
# Don't let ripgrep vomit really long lines to my terminal, and show a preview.
--max-columns=150
--max-columns-preview

# Add my 'web' type.
--type-add
web:*.{html,css,js}*

# Using glob patterns to include/exclude files or folders
--glob=!git/*

# or
--glob
!git/*

# Set the colors.
--colors=line:none
--colors=line:style:bold

# Because who cares about case!?
--smart-case
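
To check that a configuration file is actually being picked up, it helps to compare a search with and without it. The following sketch assumes a scratch directory and a hypothetical config file named ripgreprc; it relies on the --no-config flag to bypass the file for one command.

```shell
demo=$(mktemp -d) && cd "$demo" || exit 1
printf 'Hello\n' > greeting.txt

# Hypothetical config file: one argument per line, no shell quoting.
cat > ripgreprc <<'EOF'
# Case-insensitive unless the pattern contains an uppercase letter.
--smart-case
EOF

export RIPGREP_CONFIG_PATH="$demo/ripgreprc"

# -c prints the number of matching lines, and prints nothing when there are none.
with_config=$(rg -c hello greeting.txt)                 # --smart-case applies
without_config=$(rg --no-config -c hello greeting.txt)  # case-sensitive again

printf 'with config: [%s]\nwithout config: [%s]\n' "$with_config" "$without_config"
```

With the config file in effect, the lowercase pattern matches Hello; with --no-config the search is case-sensitive and finds nothing.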

Preprocessor

In ripgrep, a preprocessor is any type of command that can be run to transform the input of every file before ripgrep searches it.

This makes it possible to search virtually any kind of content that can be automatically converted to text without having to teach ripgrep how to read said content.

One common example is searching PDFs. PDFs are first and foremost meant to be displayed to users. But PDFs often have text streams in them that can be useful to search. In our case, we want to search Bruce Watson’s excellent dissertation, Taxonomies and Toolkits of Regular Language Algorithms. After downloading it, let’s try searching it:

$ rg 'The Commentz-Walter algorithm' 1995-watson.pdf

Surely, a dissertation on regular language algorithms would mention Commentz-Walter. Indeed it does, but our search isn’t picking it up because PDFs are a binary format, and the text shown in the PDF may not be encoded as simple contiguous UTF-8.

Namely, even passing the -a/--text flag to ripgrep will not make our search work.

One way to fix this is to convert the PDF to plain text first.

This won’t work well for all PDFs, but does great in a lot of cases. (Note that the tool we use, pdftotext, is part of the poppler PDF rendering library.)

$ pdftotext 1995-watson.pdf > 1995-watson.txt
$ rg 'The Commentz-Walter algorithm' 1995-watson.txt
316:The Commentz-Walter algorithms : : : : : : : : : : : : : : :
7165:4.4 The Commentz-Walter algorithms
10062:in input string S , we obtain the Boyer-Moore algorithm. The Commentz-Walter algorithm
17218:The Commentz-Walter algorithm (and its variants) displayed more interesting behaviour,
17249:Aho-Corasick algorithms are used extensively. The Commentz-Walter algorithms are used
17297: The Commentz-Walter algorithms (CW). In all versions of the CW algorithms,
 a common program skeleton is used with di erent shift functions. The CW algorithms are

But having to explicitly convert every file can be a pain, especially when you have a directory full of PDF files.

Instead, we can use ripgrep’s preprocessor feature to search the PDF.

ripgrep’s --pre flag works by taking a single command name and then executing that command for every file that it searches.

ripgrep passes the file path as the first and only argument to the command and also sends the contents of the file to stdin.

So let’s write a simple shell script that wraps pdftotext in a way that conforms to this interface:

$ cat preprocess
#!/bin/sh
exec pdftotext - -

With preprocess in the same directory as 1995-watson.pdf, we can now use it to search the PDF:

$ rg --pre ./preprocess 'The Commentz-Walter algorithm' 1995-watson.pdf
316:The Commentz-Walter algorithms : : : : : : : : : : : : : : :
7165:4.4 The Commentz-Walter algorithms
10062:in input string S , we obtain the Boyer-Moore algorithm. The Commentz-Walter algorithm
17218:The Commentz-Walter algorithm (and its variants) displayed more interesting behaviour,
17249:Aho-Corasick algorithms are used extensively. The Commentz-Walter algorithms are used
17297: The Commentz-Walter algorithms (CW). In all versions of the
CW algorithms, a common program skeleton is used with di erent shift functions. The CW algorithms are
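
Because the preprocessor runs for every file searched, a wrapper that handles more than one format usually dispatches on the file name, and the --pre-glob flag can limit which files are handed to it at all. The sketch below is one possible variant, not the script from the walkthrough above: it adds a gzip branch purely so that the example can be tried without poppler installed, and the file notes.txt.gz is invented for the demo.

```shell
demo=$(mktemp -d) && cd "$demo" || exit 1

# Hypothetical preprocessor: ripgrep passes the file path as $1
# and the file contents on stdin; the script writes text to stdout.
cat > preprocess <<'EOF'
#!/bin/sh
case "$1" in
*.pdf) exec pdftotext - - ;;   # requires poppler
*.gz)  exec gzip -dc ;;        # decompress stdin to stdout
*)     exec cat ;;             # pass everything else through unchanged
esac
EOF
chmod +x preprocess

# A small compressed file to search.
printf 'The Commentz-Walter algorithm\n' | gzip -c > notes.txt.gz

# --pre-glob restricts the preprocessor to PDF and gzip files;
# all other files are read directly by ripgrep.
count=$(rg --pre ./preprocess --pre-glob '*.{pdf,gz}' -c 'Commentz-Walter' notes.txt.gz)
printf 'matching lines: %s\n' "$count"    # matching lines: 1
```

Restricting the preprocessor with --pre-glob avoids spawning a process for files that ripgrep can already read on its own, which matters when searching large directories.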

Examples from Stéphane Robert

rg Test README.md                      # Case-sensitive search in a single file
rg -i test                             # Case-insensitive search in all subdirectories
rg '^\*/s.*'                           # Search using a regular expression
rg -tpy for                            # Find all for loops in Python files
rg -Tpy for                            # Find all for loops in files other than Python
rg '^\*\s' -g '*.md'                   # Search by glob. Globs can be combined, and negated with the ! character