
For a university project, I have to count all the files in a folder. I have used the command:

find ./dirName | wc -l

However, when I compare this with the file count reported by Nautilus, the find result is considerably higher. See the screenshot below:

[screenshot: file count reported by Nautilus]

./dirName is actually a directory of files from a repository (SVN/Git), and I need to find out how many files make up the system.

Could anyone explain why these differences occur and maybe tell me which one is more reliable?

danielcooperxyz

3 Answers


Nautilus doesn't count hidden files.

Files and directories whose names start with a dot (.) are treated as hidden on Linux.

Steps to reproduce:

mkdir somedir && cd somedir
touch .hidden .hidden2 regular regular2      # 4 files, 2 hidden
find . | wc -l                               # outputs 5 (4 files + dir itself)

Nautilus reports: Contents: 2 items, totalling 0 bytes
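To get a number from find that is closer to what Nautilus shows, one option is to skip the starting directory and prune every entry whose name starts with a dot. The following is only a sketch: it assumes GNU find (for -mindepth), and the throwaway directory and the variable names are made up for illustration.

```shell
# Rebuild the layout from the reproduction above in a throwaway directory.
demo=$(mktemp -d)
touch "$demo/.hidden" "$demo/.hidden2" "$demo/regular" "$demo/regular2"

# Everything find sees, including the starting directory itself.
all=$(find "$demo" | wc -l)

# Skip the starting directory (-mindepth 1) and prune hidden entries,
# so hidden directories are not descended into either.
visible=$(find "$demo" -mindepth 1 -name '.*' -prune -o -print | wc -l)

echo "all=$all visible=$visible"
rm -rf "$demo"
```

Here the first count is 5 and the second is 2, which matches the "2 items" that Nautilus reports for this layout.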

Using Git

Here's a quick demonstration of how many metadata files Git keeps, all of them inside the .git directory.

git init myrepo                              # Initialized [...] in myrepo/.git/
cd myrepo/
find . | wc -l                               # outputs 23! for an empty repository
tree -a                                      # outputs 10 directories, 12 files

echo "have to add something for git ls-tree" > somefile
git add somefile && git commit -m "Initial commit"
find . | wc -l                               # outputs 38 (!)
git ls-tree -r HEAD | wc -l                  # outputs 1

Nautilus also reports 1 item there.
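If the goal is to count only the files that make up the project, one possible approach is to have find prune the version-control metadata directories. This is a sketch under assumptions: the .git directory below is faked with mkdir and touch so that the example does not need Git installed, and all file names are invented.

```shell
# Fake a small checkout: two project files plus some pretend Git metadata.
repo=$(mktemp -d)
mkdir -p "$repo/.git/objects" "$repo/src"
touch "$repo/.git/HEAD" "$repo/.git/objects/abc" "$repo/README" "$repo/src/main.c"

# Every regular file, metadata included.
everything=$(find "$repo" -type f | wc -l)

# Prune .git and .svn directories, then count the remaining regular files.
project_files=$(find "$repo" \( -name .git -o -name .svn \) -prune -o -type f -print | wc -l)

echo "everything=$everything project_files=$project_files"
rm -rf "$repo"
```

Inside a real Git repository, git ls-files | wc -l is another way to count the tracked files. Unlike the find command, it leaves out untracked files as well.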

My suggestion: use tree

As Gilles pointed out in his answer, piping the output of find to wc -l is not reliable if file names contain special characters such as newlines.

It seems that tree is capable of doing this right:

tree -a
.
├── dir
│   └── regular3
├── dir2
├── .hidden
├── .hidden2
├── regular
└── regular2

2 directories, 5 files
gertvdijk

There could be file names with newlines in them. Highly inadvisable, but technically possible. This may be what your exercise was about.

One way to reliably count the files under a directory is to make find print something that can be counted reliably, i.e. with one item per file.

find ./dirName -printf a | wc -c

Keep in mind that find includes dirName itself, and recurses into subdirectories.
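A small experiment shows the difference between the two ways of counting. This is a sketch that assumes GNU find (for -printf); the directory and file names are made up.

```shell
# One ordinary file and one file whose name contains a newline.
demo=$(mktemp -d)
touch "$demo/plain" "$demo/with
newline"

# Counting lines: the embedded newline makes one file look like two.
by_lines=$(find "$demo" | wc -l)

# Counting one character per entry: the file names never reach wc.
by_chars=$(find "$demo" -printf a | wc -c)

echo "by_lines=$by_lines by_chars=$by_chars"
rm -rf "$demo"
```

There are three entries here (the directory itself plus two files), so the -printf variant gives 3 while the line count gives 4.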

If you only want the files inside dirName, without recursing, let the shell count them:

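GLOBIGNORE=.:..      # bash only: * now also matches dot files, but never . or ..
set -- *             # load the matching names into the positional parameters
echo "$#"            # print how many there are (1 for an empty directory unless nullglob is set)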
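For shells other than bash, a similar non-recursive count can be sketched with plain POSIX globs. The function name count_entries is hypothetical, and the three patterns are one common way to cover ordinary names, dot files, and names that begin with two dots.

```shell
# Count the entries directly inside a directory, hidden ones included.
count_entries() {
    n=0
    for entry in "$1"/* "$1"/.[!.]* "$1"/..?*; do
        # An unmatched pattern is left as literal text, so keep only names that exist.
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Usage on a throwaway directory: one hidden file, one regular file, one subdirectory.
demo=$(mktemp -d)
mkdir "$demo/sub" "$demo/empty"
touch "$demo/.hidden" "$demo/regular" "$demo/sub/inner"
top_level=$(count_entries "$demo")          # .hidden, regular, sub, empty
in_empty=$(count_entries "$demo/empty")     # nothing inside
echo "top_level=$top_level in_empty=$in_empty"
rm -rf "$demo"
```

Because each candidate is checked for existence, an empty directory gives 0 instead of counting the unexpanded pattern.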

Try checking the output of find:

find somewhere | less

You'll see that find by default outputs every kind of file, making no distinction based on type or name. Nautilus, by contrast, does not count the starting directory (somewhere in the example) or files that it would not show when browsing.

To solve the issue, use the -type option of find:

find somewhere -type f | wc -l
find somewhere ! -type d | wc -l

The first command counts only regular files. The second counts every non-directory item (e.g. regular files, symlinks, block devices, UNIX sockets, and so on). See man find for more information.
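The difference between the two tests can be seen on a directory that holds something other than regular files and directories. The sketch below uses a named pipe for that purpose; the directory layout and variable names are invented.

```shell
# One regular file, one subdirectory and one named pipe (FIFO).
demo=$(mktemp -d)
touch "$demo/file"
mkdir "$demo/sub"
mkfifo "$demo/pipe"

# Only regular files.
regular=$(find "$demo" -type f | wc -l)

# Everything that is not a directory: the regular file and the FIFO.
non_dir=$(find "$demo" ! -type d | wc -l)

echo "regular=$regular non_dir=$non_dir"
rm -rf "$demo"
```

Neither command counts the starting directory or the subdirectory, so the results are 1 and 2.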

You might also be interested in reading about -H, -L and -P, which control how find handles symlinks (and therefore how symlinks influence the counts).
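As a rough illustration of how the symlink options change a count, the sketch below compares -P (never follow symlinks, the default) with -L (follow them). The directory and file names are made up.

```shell
# One regular file and one symlink that points to it.
demo=$(mktemp -d)
touch "$demo/file"
ln -s file "$demo/link"

# -P: the symlink is examined as a symlink, so it is not a regular file.
no_follow=$(find -P "$demo" -type f | wc -l)

# -L: the symlink is followed, so its target's type (regular file) is tested.
follow=$(find -L "$demo" -type f | wc -l)

echo "no_follow=$no_follow follow=$follow"
rm -rf "$demo"
```

With -L the same underlying file is counted twice, once under each name, which may or may not be what a file count is supposed to mean.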