geek girl, linux land

find: a cool command for sysadmins

One of my all-time favourite Linux commands as a sysadmin is ‘find‘. It’s flexible, powerful and it’s got me out of trouble many times.

find works by traversing a directory tree for files and folders that meet a specified search criterion.

For example, if you want to find a JPEG file called Cats.JPG under /home/suuze_linux:

$ find /home/suuze_linux -name Cats.JPG

What if you know the JPEG file’s name has the word ‘Cats’ in it, but you aren’t sure exactly what the rest of the name is? You can use wildcards:

$ find /home/suuze_linux -name "*Cats*.JPG"

The searches above are case-sensitive. If you’re not sure if the filename has ‘Cats’, ‘cAts’, ‘cats’, or any other combination of the letters, you could try:

$ find /home/suuze_linux -iname "cats.jpg"
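If you'd like to try these three searches without touching your real files, here is a minimal sketch. It assumes a Linux (case-sensitive) filesystem, and the scratch directory and file names are made up purely for the demo.

```shell
# Scratch directory so nothing real is touched (file names are hypothetical).
demo=$(mktemp -d)
touch "$demo/Cats.JPG" "$demo/cats.jpg" "$demo/MyCats2.JPG" "$demo/dogs.jpg"

# Case-sensitive, exact name: only Cats.JPG matches.
exact=$(find "$demo" -name "Cats.JPG" | sort)

# Case-sensitive with wildcards: Cats.JPG and MyCats2.JPG match.
wild=$(find "$demo" -name "*Cats*.JPG" | sort)

# Case-insensitive, exact name: Cats.JPG and cats.jpg match.
nocase=$(find "$demo" -iname "cats.jpg" | sort)

printf '%s\n' "-- exact:" "$exact" "-- wildcard:" "$wild" "-- iname:" "$nocase"
rm -rf "$demo"
```

Quoting the pattern matters: without the double quotes, the shell may expand the wildcard itself before find ever sees it.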

find can search using many other kinds of criteria. To search for items that have changed in the last day:

$ find /home/suuze_linux -mtime -1

The ‘-‘ indicates we’re looking for all files whose contents were last modified less than 1 day (24 hours) ago. If the ‘-‘ is left off, find will return matches that were modified between 1 and 2 days ago, because find counts age in whole days and ignores any fraction.
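The day-counting is easier to see with a small experiment. This is only a sketch: it assumes GNU touch (for the 'N days ago' date syntax) and GNU find, and the file names are invented for the demo.

```shell
demo=$(mktemp -d)
touch "$demo/now.txt"                          # modified just now
touch -d '36 hours ago' "$demo/yesterday.txt"  # GNU touch date syntax
touch -d '3 days ago' "$demo/old.txt"

# find counts age in whole 24-hour periods and drops the fraction.
recent=$(find "$demo" -type f -mtime -1)   # less than 24 hours ago: now.txt
one=$(find "$demo" -type f -mtime 1)       # 24 to 48 hours ago: yesterday.txt
older=$(find "$demo" -type f -mtime +1)    # 48 hours ago or more: old.txt

printf '%s\n' "-mtime -1: $recent" "-mtime 1:  $one" "-mtime +1: $older"
rm -rf "$demo"
```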

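How about files whose inode changed in the last week? That covers changes to permissions, ownership or links as well as to content. (Note that -ctime tests the inode change time, not the creation time.) No problem: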

$ find /home/suuze_linux -ctime -7
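One caveat worth seeing in action: -ctime looks at when the inode last changed, which is not the same thing as when the file was created or when its contents were modified. A minimal sketch, assuming GNU touch and a made-up file name:

```shell
demo=$(mktemp -d)

# Backdate the modification time. The act of backdating is itself an inode
# change, so the change time (ctime) is set to "now".
touch -d '10 days ago' "$demo/report.txt"

by_mtime=$(find "$demo" -type f -mtime -7)   # empty: contents are "10 days old"
by_ctime=$(find "$demo" -type f -ctime -7)   # matches: inode changed just now

printf '%s\n' "-mtime -7: [$by_mtime]" "-ctime -7: [$by_ctime]"
rm -rf "$demo"
```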

In order to fully understand how find works, bear a couple of things in mind: One of the fundamental (and over-simplified) precepts of Linux and UNIX-like operating systems is that everything is a file. This is not strictly true. But the precept works for our purposes here. find uses inode data to look for matches. Everything in a Linux filesystem is associated with an inode (Index Node). Therefore (unless otherwise specified), find will return both files and folders that match.
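To see the "files and folders" point for yourself, here is a small sketch using a scratch directory. The names are hypothetical; ‘-type f’ and ‘-type d’ are standard find tests.

```shell
demo=$(mktemp -d)
mkdir "$demo/cats"        # a folder
touch "$demo/cats.txt"    # a regular file

both=$(find "$demo" -name "cats*" | sort)      # folder and file
files=$(find "$demo" -name "cats*" -type f)    # regular files only
dirs=$(find "$demo" -name "cats*" -type d)     # folders only

printf '%s\n' "-- no -type:" "$both" "-- -type f:" "$files" "-- -type d:" "$dirs"
rm -rf "$demo"
```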

Here’s another example. Perhaps your root filesystem is close to capacity. You want to find all files that are greater than 1 GiB in size:

$ sudo find / -size +1G

Again, if the ‘+’ was left off, find would return only files whose size, rounded up to the next whole GiB, is exactly 1 GiB.
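The rounding that find applies to sizes can be surprising, so here is a scaled-down sketch using MiB instead of GiB. It assumes GNU find and GNU coreutils' truncate (which creates files of a given length without writing real data), and the file names are invented.

```shell
demo=$(mktemp -d)
truncate -s 10K "$demo/small.bin"   # 10 KiB
truncate -s 3M  "$demo/big.bin"     # 3 MiB

# Sizes are rounded UP to the next whole unit before comparing.
over=$(find "$demo" -type f -size +1M)   # big.bin (3 units is more than 1)
exact=$(find "$demo" -type f -size 1M)   # small.bin (10 KiB rounds up to 1 MiB)

printf '%s\n' "-size +1M: $over" "-size 1M:  $exact"
rm -rf "$demo"
```

So a size without a ‘+’ or ‘-‘ rarely does what you would guess. For housekeeping, the ‘+’ form is usually the one you want.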

As an aside, GiB stands for gibibytes (i.e., 1024 * 1024 * 1024 bytes). This is different to GB which stands for gigabytes (i.e., 1000 * 1000 * 1000 bytes). Why do we care? Among other things, apparently a whole lot of lawsuits have been brought against hardware manufacturers over the difference between 1024³ and 1000³! You can read all about this in MASV’s article, GB vs GiB: What’s the Difference Between Gigabytes and Gibibytes?

The power of find is in its flexibility. Multiple search criteria can be combined. Let’s say your filesystem is filling up, and you know you have old ISO files lying around that can be got rid of. You could search for them with something like this:

$ sudo find / -iname "*.iso" -type f -mtime +365 -size +4G

Breaking it down:

-iname specifies a case-insensitive search on the file name.
‘*‘ is a wildcard which will match a string of length 0 or more.
-type specifies the file type – in this case, ‘f’ will constrain the search to regular files only.
-mtime +365 matches files whose contents were last modified more than 365 days ago.
-size +4G matches files larger than 4 gibibytes.

The above command would give you a list of files that match the search criteria – if any exist. (Without ‘-type f’, the list could include folders too.) Because the search starts at /, the resulting list will be fully qualified. I.e., find will give you the complete path to each match. You can then navigate to its folder, query it using ls, remove it using rm, etc.

However, find can make our lives even easier with the ‘-exec’ flag. Like so:

$ sudo find / -iname "*.iso" -type f -mtime +365 -size +4G -exec ls -l {} \;

Instead of returning a list of matching files, find will run the command that follows ‘-exec’ on each matching file. In this case, it will do a long listing of each. Neat, huh?
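A related variation: ending ‘-exec’ with ‘+’ instead of ‘\;’ asks find to pass many matches to a single run of the command, rather than starting the command once per file. The sketch below uses echo and made-up file names so the difference shows up in the output.

```shell
demo=$(mktemp -d)
touch "$demo/a.iso" "$demo/b.iso" "$demo/c.iso"

# '\;' runs echo once per match: three lines of output.
per_file=$(find "$demo" -type f -name "*.iso" -exec echo found {} \;)

# '+' runs echo once for the whole batch: one line listing all three.
batched=$(find "$demo" -type f -name "*.iso" -exec echo found {} +)

printf '%s\n' "-- with \\;" "$per_file" "-- with +" "$batched"
rm -rf "$demo"
```

On a search that returns thousands of files, the ‘+’ form can be noticeably quicker because far fewer processes are started.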

Note that the command above can be shortened as follows:

$ sudo find / -iname "*.iso" -type f -mtime +365 -size +4G -ls

The ‘-ls’ flag effectively is a ‘-exec ls -lids {} \;’ – it does a long listing on matching files and directories, displaying inode numbers and sizes in blocks.

A sequence of find commands such as the following is a handy addition to a Linux sysadmin’s toolbox.

$ sudo find / -iname "*.iso" -mtime +365 -size +4G -exec ls -ld {} \;

followed by:

$ sudo find / -iname "*.iso" -mtime +365 -size +4G -exec rm {} \;

or:

$ sudo find / -iname "*.iso" -mtime +365 -size +4G -delete

However, use this with caution! ‘rm‘ and the ‘-delete’ option both remove files permanently – there is no recycle bin option here. Once rm has removed a file on a Linux filesystem, the only way to get it back is to restore from a backup (if one exists!).
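One habit that helps: run the search with ‘-print’ first, check the list, and only then swap ‘-print’ for ‘-delete’. Here is a minimal sketch in a scratch directory, with hypothetical file names.

```shell
demo=$(mktemp -d)
touch "$demo/old.iso" "$demo/keep.txt"

# Step 1: dry run. -print only lists what the tests match.
find "$demo" -type f -iname "*.iso" -print

# Step 2: identical tests, with -delete in place of -print.
find "$demo" -type f -iname "*.iso" -delete

remaining=$(find "$demo" -type f)
echo "left behind: $remaining"
rm -rf "$demo"
```

Keep ‘-delete’ at the end of the command. find evaluates its expression from left to right, so a ‘-delete’ placed before the tests would remove things before they have been filtered.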

A cool thing about Linux is how the output of one command can be used in another. For instance…

if you want to change directory into a folder called ‘scripts’,
you know scripts is under your home directory somewhere but you can’t recall exactly where, and
you know there is only one ‘scripts’ folder sitting under your home directory tree somewhere,

… you could run something like this from your home directory:

$ cd "`find /home/suuze_linux -name scripts`"

A ‘pwd‘ should now show the full path to the scripts directory as your current working directory.

I used double quotes in the example above because I know the path to my scripts folder is likely to have folders with spaces in their names. If you knew for sure there are no folder names with spaces in the directory tree, you could leave out the double quotes. But leave the back ticks in – they are what causes the Linux shell to evaluate the find statement first, before passing the result to cd.
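For what it's worth, most shells also accept $(...) in place of back ticks, and it is easier to read when quotes are involved. The sketch below is one possible variation rather than the only way to do it: the directory layout is invented, ‘-type d’ skips any plain file that happens to be called scripts, and ‘-print -quit’ (a GNU find feature) stops at the first match.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/my projects/scripts"   # note the space in the path

# $(...) is evaluated first, just like back ticks; the double quotes keep
# the space in "my projects" from splitting the path in two.
cd "$(find "$demo" -type d -name scripts -print -quit)"
where=$(pwd)
echo "$where"

cd / && rm -rf "$demo"
```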

This post just scratches the surface of what find can do! To find out more:

$ man find

$ find --help

What do you think? What is your favourite find option? Feel free to comment below – and happy finding!

New to Docker?

Docker is a containerization system that has been around since 2013. For the uninitiated, it can be a complex beast to understand and conquer. Containerization systems like Docker, and orchestrators like Kubernetes, are key to rapidly deploying applications and services and lend themselves to DevOps and models like the increasingly popular continuous integration / continuous deployment.

If you are a new Docker user, you might find the following articles helpful:

freeCodeCamp.org’s Docker 101 tutorial: https://www.freecodecamp.org/news/docker-101-fundamentals-and-practice-edb047b71a51/
Docker’s own Docker 101 Q&A transcript: https://www.docker.com/blog/docker-101-getting-to-know-docker/

Happy reading!

“Love is chemistry” – Inma Martínez on AI

I’m fascinated by AI’s potential and promise.

And so I enjoyed this interview in EL PAÍS. Inma Martínez, an expert in artificial intelligence, chats about natural language processing (NLP), smart cars and whether technology can predict the future. Enjoy. 😊

https://english.elpais.com/usa/2021-12-07/love-is-chemistry-algorithms-fail-the-more-abstract-and-complicated-a-person-is.html

TaskJuggler: an open source project management tool

So my random Linux-related inspiration for the week has been to try out TaskJuggler.

My husband and I will be launching our older child into the world later this year, and there are a few convoluted tasks involved in making this happen. It’s doing my head in. So I went hunting for a Linux-based project management tool that I would find relatively clean and useful to work with.

TaskJuggler is an open source project management tool. I came across it in UbuntuPIT’s post “The 20 Best Project Management Tools for Linux Desktop” from February this year. So I’m hoping all its recommendations are up to date, though TaskJuggler itself doesn’t seem to have been updated since 2020.

TaskJuggler’s latest version doesn’t have a graphical user interface. Apparently version 2.x did, but the GUI hasn’t been ported across to 3.x. You can use a command shell, a plaintext editor (no fancy word processing software allowed) and a web browser. This might therefore only appeal to command-line nerds like me.

Installation

To install, I followed the documentation at https://taskjuggler.org. Basically one installs RubyGems (if not already done), then installs the TaskJuggler gem. If you like reading transcripts, here’s mine. Note that I didn’t install TaskJuggler system-wide, just for my user account.

$ sudo dnf install rubygems
Last metadata expiration check: 1:09:23 ago on Fri 01 Apr 2022 14:39:29.
Dependencies resolved.
==============================================================================
 Package                 Arch        Version               Repository    Size
==============================================================================
Installing:
 rubygems                noarch      3.2.22-149.fc34       updates      247 k
Installing dependencies:
 ruby                    x86_64      3.0.2-149.fc34        updates       41 k
 ruby-libs               x86_64      3.0.2-149.fc34        updates      3.2 M
 rubygem-json            x86_64      2.5.1-201.fc34        fedora        64 k
 rubygem-psych           x86_64      3.3.0-149.fc34        updates       51 k
 rubypick                noarch      1.1.1-14.fc34         fedora       9.9 k
Installing weak dependencies:
 ruby-default-gems       noarch      3.0.2-149.fc34        updates       32 k
 rubygem-bigdecimal      x86_64      3.0.0-149.fc34        updates       54 k
 rubygem-bundler         noarch      2.2.22-149.fc34       updates      367 k
 rubygem-io-console      x86_64      0.5.7-149.fc34        updates       25 k
 rubygem-rdoc            noarch      6.3.1-149.fc34        updates      400 k

Transaction Summary
==============================================================================
Install  11 Packages

Total size: 4.4 M
Total download size: 4.3 M
Installed size: 16 M
Is this ok [y/N]: y
Downloading Packages:
:
:
Installed:
  ruby-3.0.2-149.fc34.x86_64                                                  
  ruby-default-gems-3.0.2-149.fc34.noarch                                     
  ruby-libs-3.0.2-149.fc34.x86_64                                             
  rubygem-bigdecimal-3.0.0-149.fc34.x86_64                                    
  rubygem-bundler-2.2.22-149.fc34.noarch                                      
  rubygem-io-console-0.5.7-149.fc34.x86_64                                    
  rubygem-json-2.5.1-201.fc34.x86_64                                          
  rubygem-psych-3.3.0-149.fc34.x86_64                                         
  rubygem-rdoc-6.3.1-149.fc34.noarch                                          
  rubygems-3.2.22-149.fc34.noarch                                             
  rubypick-1.1.1-14.fc34.noarch                                               

Complete!
$ 
$ gem install taskjuggler
Successfully installed taskjuggler-3.7.1
Parsing documentation for taskjuggler-3.7.1
Done installing documentation for taskjuggler after 3 seconds
1 gem installed
$ which tj3
~/bin/tj3
$  ruby -e "puts Gem::Specification.find_by_name('taskjuggler').gem_dir"
/home/<obscured>/.local/share/gem/ruby/gems/taskjuggler-3.7.1
$ tj3 --version
tj3 (TaskJuggler) 3.7.1
$ sudo find / -name taskjuggler
/home/<obscured>/.local/share/gem/ruby/gems/taskjuggler-3.7.1/lib/taskjuggler
$

Using it

TaskJuggler 3.x basically works like a compiler. The user sets up the project details in a source file and then passes that file to TaskJuggler’s tj3 command.

Taking TaskJuggler for a small test drive, I first created a test project using a plaintext editor. I used vi, but you could use anything – except a word processor.

$ cat test-project.tjp 
### Test project

project test_project "My Test Project 2022"  2022-01-01 +12m {
  timezone "Australia/Melbourne"
  currency "AUD"
  timeformat "%Y-%m-%d"
  numberformat "-" "" "," "." 1
  currencyformat "-" "" "," "." 0
}

### Define resources

resource parent1 "Parent1" {
  email "blah@blahs.com"
}

resource parent2 "Parent2" {
  email "blahmore@blahs.com"
}

### Tasks and subtasks

task uni_app "University application" {
}

task travel_oz "Travel to Oz" {

  task accom_transport "Oz accommodation and transport" {
  }

  task flights "Flights to Oz" {
  }

} 

### Reporting and exports. Taken from https://taskjuggler.org example. 
taskreport breakdown "ProjectBreakdown" {
  formats html
  caption "This is the project breakdown"
  headline "Project Breakdown"
  columns name, start, end, daily
  # Don't hide any resource, meaning show them all.
  hideresource 0
}

# Export the project as fully scheduled project. Taken from https://taskjuggler.org example.
export "FullProject" {
  definitions *
  taskattributes *
  hideresource 0
}
$

The next step is to compile the source code file using TaskJuggler:

$ ~/bin/tj3 test-project.tjp 
TaskJuggler v3.7.1 - A Project Management Software

Copyright (c) 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020
              by Chris Schlaeger <cs@taskjuggler.org>

This program is free software; you can redistribute it and/or modify it under
the terms of version 2 of the GNU General Public License as published by the
Free Software Foundation.

Reading file test-project.tjp                                [      Done      ]
Preparing scenario Plan Scenario                             [      Done      ]
Scheduling scenario Plan Scenario                            [      Done      ]
Checking scenario Plan Scenario                              [      Done      ]
Report ProjectBreakdown                                      [      Done      ]
Report FullProject                                           [      Done      ]
$
$ ls -la
total 52
drwxrwxr-x. 1 <obscured> <obscured>   186 Apr  1 17:47 .
drwxrwxr-x. 1 <obscured> <obscured>    62 Apr  1 17:41 ..
drwxrwxr-x. 1 <obscured> <obscured>    48 Apr  1 17:45 css
-rw-rw-r--. 1 <obscured> <obscured>  1515 Apr  1 17:47 FullProject.tjp
drwxrwxr-x. 1 <obscured> <obscured>   284 Apr  1 17:45 icons
-rw-rw-r--. 1 <obscured> <obscured> 17200 Apr  1 17:47 ProjectBreakdown.html
drwxrwxr-x. 1 <obscured> <obscured>    26 Apr  1 17:45 scripts
-rw-rw-r--. 1 <obscured> <obscured>  1102 Apr  1 17:47 test-project.tjp
$

Opening the file ProjectBreakdown.html in a browser shows the generated Project Breakdown report.

And that’s it.

The blurb from TaskJuggler’s home page says:

TaskJuggler is project management software for serious project managers. It covers the complete spectrum of project management tasks from the first idea to the completion of the project. It assists you during project scoping, resource assignment, cost and revenue planning, risk and communication management.
https://taskjuggler.org/index.html

Its usefulness as a collaborative resource would be limited. There are far fancier and WYSIWYG tools around. But it has potential for someone like me who is looking for a reasonably simple text-based personal project management solution without all the overhead that a WYSIWYG solution brings.

Image credits TaskJuggler.org.

Coding: What if you don’t know English?

So I’ve been working on a little project recently. I’m creating an introductory Python course based on the Foundations of Python Programming from Runestone Academy, but modifying it for high school students whose primary language isn’t English.

As I ran a pilot of the course, I realised that my test pupil is somewhat disadvantaged. She knows English, but it isn’t her first language. I began to wish I could present her with a programming language whose tokens look like those from her first language. How much easier it would be for her to learn to code!

And then I realised how biased the world of programming is towards English.

The article “Coding Is for Everyone—as Long as You Speak English” by Gretchen McCulloch on Wired.com mirrors my frustrations.

There needs to be a shift. I wonder what it might look like for this shift to be implemented. It might need a completely new tool we haven’t dreamed of yet. Or something we already have that can be repurposed.

Perhaps something akin to natural language processing that can translate source code into language tokens that align with different natural languages?

That article again from Wired.com is “Coding Is for Everyone—as Long as You Speak English.” Have a read and let me know your thoughts!

#BreakTheBias: Happy IWD 2022!

This International Women’s Day, we are being encouraged to Break The Bias.

I’m a woman who loves working in the tech industry, but I often think I am less competent than male colleagues. Heck – I even think I’m less competent in tech than men who don’t work in the field, but can spout confident techie lingo.

So I love organisations like Girls In Tech whose mission is to encourage and support women working in tech roles. Girls In Tech run courses and provide resources for women in tech – and I’ve benefited from their training.

Head over to their article on #BreakTheBias to see what they have to say about IWD 2022.

I fell into open source by accident

So how did I start on this open source journey?

In the late 1990s, I was a fresh Computer Science graduate. Australia had recently come out of a recession, and the market was starting to be glutted with graduates like me. I was expecting to become a software developer. But about a hundred job applications and five interviews later, I got an entry-level sysadmin role. I became part of a team managing Windows file and print servers, one midrange server that ran an operating system that was neither Windows nor UNIX… and one brand spanking new Sun Microsystems midrange computer running SunOS 4. It became my first UNIX love.

Not long after, the I.T. infrastructure management was outsourced to a managed services provider (MSP). The new organisation had a strictly siloed approach. Our team of sysadmins was split across different specialities. I ended up in the Midrange/UNIX team.

And that’s how my specialisation in the UNIX and Linux space began.

International Women’s Day 2022’s theme is #BreakTheBias

We might get tired of hearing the same old report of how under-represented women are in science and technology. And is it really true, and why does it matter anyway?

Well, here’s a real life example. I worked for almost 25 years in tech. At least 24 of those years were focused on UNIX and, later, Linux. In all those years, I’ve had a grand total of 3 other female colleagues working in UNIX and Linux with me. I’ve had more female managers than colleagues.

That’s not to say there weren’t many women in the organisations I worked for. Just that the majority of them were in non-technical roles. Even database and app-development teams I’ve worked with have had a significantly large male majority, though they had more female representation than the UNIX and Linux space had.

There are many theories out there on this gender disparity in tech. Future post coming up on this – and also why this gender disparity matters anyway.

For now, ’nuff said. Bring on #BreakTheBias.

Women in Open Source

On 8 March, we celebrate International Women’s Day

IWD 2022’s theme is #BreakTheBias. It would be helpful in its lead-up to look at gender diversity in tech. Today, I’m taking a look at open source in particular.

Back in March 2021, Academy Software Foundation posted an article titled “The Truth about Women in Open Source” by Nithya Ruff. I found this encouraging and informative. The article details some stereotypes we have about women’s involvement in open source, and goes on to evaluate their truth and where some more work needs to be done to bridge gender gaps.

The article finishes with several resources that aspiring open source contributors might find helpful.

I also found it helpful to read up on Nithya’s biography here in a write-up on Behind the Tech.

If like me you identify as female and work (or used to work) in tech, what would you agree with, or disagree with, in Nithya’s article? Also, have you ever worked in open source, and what was your experience like?

Image credit ThisIsEngineering from Pexels.com.

Security Tip # 6 – Multi-Factor Authentication

This post is part of a series on the top 10 things I look at when securing my home Linux installations. You can find the other posts here.

Tips 1 through to 5 generally apply system-wide – that is, they are system configuration choices you will make. Tips 6 through to 10 are more per-user choices. This distinction won’t make much difference in a home environment where each device is dedicated to a single user. It will begin to be apparent in an environment where more than one user uses a device.

Use multi-factor authentication (MFA)

Multi-factor authentication, or MFA, is designed to prove you are who you say you are. A service using MFA for authentication will require you to provide at least two factors in order to log in. For example,

something you know (e.g., a username-password combination), and
something you have (an access code on a separate device or a token)

Two-Factor Authentication (or 2FA) is MFA that requires exactly two factors to be provided for authentication.

Multi-factor authentication is not the perfect solution to security woes. There are tales of hackers working around it. However, a malicious actor will have to work harder to bypass MFA than to get past a single username-password combination.

MFA examples

Most major vendors use some form of MFA in their application. For instance you can set up your Facebook account to use an authentication app.

Examples of authentication apps are:

FreeOTP – open source, sponsored and published by Red Hat
Twilio Authy app – free, but closed source, though Twilio does sponsor some open source projects. Twilio state that the app is free because it is paid for by businesses using the Authy API.
Google Authenticator app – free, but closed source (the latter since 2013). It’s part of Google’s two-step verification option.

Major vendors like Apple, Google and Amazon may use another kind of MFA called a One Time Passcode (OTP) when users log in – by sending a code via text message, email, or out to a registered device. You will then need to enter that code in order to proceed with logging in.

Recap of my top 10 tips for securing Linux @home

1. Enable and use an OS-level firewall
2. Enable SELinux or another Mandatory Access Control mechanism
3. Use sudo
4. Apply software updates automatically or often
5. Use encryption
6. Use multi-factor authentication
7. Enable threat-detection
8. Browse securely
9. Limit running services
10. Backup securely

Update: My remaining posts this week are on women in tech. I will be back another week with thoughts on Tip # 7 – Threat-detection. Meanwhile, like or comment to let me know what you thought of this tip!

A glossary of terms is available here.

Once again, ensure you’re familiar with the disclaimer here!

Featured image by SHVETS production from Pexels.com. Wave image from Pexels.com by DLKR

Security Tip # 5 – Encryption

This post is part of a series on my top 10 tips for securing Linux home installations. You can find the other posts here.

A bit of a preamble

I’ve learned that IT security is like physical security: we have to know our context, understand the threats in it and secure accordingly. Each person’s security needs are different. No-one can give a one-size-fits-all solution to security, least of all for securing our Linux devices at home.

However some basic concepts are handy. The tips in this series follow some basic security principles I’ve adopted for myself.

Use encryption

This is a big topic, worthy of several posts of its own. Encryption in I.T. is essentially scrambling data using cryptography, so that it cannot be read without the correct decryption keys. A few years ago encryption would have been considered overkill on a home system. Now it is increasingly standard across most I.T. products and solutions.

Most home users will encounter two areas where encryption applies:

at rest, and
in transit.

Encryption at rest

Encryption at rest is cryptographic scrambling of data where it is stored, whether locally on a system or externally. External storage would include on-premises and cloud-based storage.

My take is that encryption of stored cloud-hosted data is absolutely critical, and I would strongly recommend encryption of your other storage too. You would have to weigh the potential risk (for example, of data loss or data being made public if your physical device is stolen) against the effort involved (and the possible risk of data loss if, for instance, you forget your decryption passphrase!).

Storage encryption is usually easily implemented these days, and much easier done during OS installation rather than retrofitted. On Fedora and Ubuntu, you can choose to encrypt your local devices during partitioning & filesystem layout at installation.

Encryption in transit

Encryption of data during transmission is critically important, with great strides being made in some areas and little in others. Without encryption in transit, data being transferred can be easily read at various points during transmission.

As an example, many emails sent today are in clear text, and easily intercepted and read during transmission. Given this, there is increasing interest in the encryption of emails during transmission. But it’s still awkward for home users to achieve fully end to end. For now my recommendation is to avoid sending private information in clear text by email.

Web traffic is different. Most of us now know to use secure HTTP (https) to connect to websites, and to look for the padlock icon next to a URL to ensure our connection is encrypted. This also is a huge topic as there are many ways of bypassing these measures, or of malicious actors setting up sites masquerading as genuine ones. But I will leave it there for now.

Find out more

That’s it! Stay tuned for tomorrow’s discussion on multi-factor authentication. Meanwhile, like or comment to let me know what you think of this tip!

A glossary of terms is available here.

Once again, ensure you’re familiar with the disclaimer here!

Sources

The post references documentation and articles on fedoraproject.org, wikipedia.org and pcmag.com. Sources are linked to within the post’s content above.

Featured image by cottonbro from Pexels.com.