Job Scheduling File Naming Conventions

Job scheduling and batch processing often involve processing files. How you name and organize your jobs’ files can significantly affect your ability to find those files later and to understand what they contain. The Smithsonian lists five precepts for file naming and organization that make it obvious where to find specific data and what the files contain:

  1. Have a distinctive, human-readable name that gives an indication of the content.
  2. Follow a consistent pattern that is machine-friendly.
  3. Organize files into directories (when necessary) that follow a consistent pattern.
  4. Avoid repetition of semantic elements among file and directory names.
  5. Have a file extension that matches the file format (no changing extensions!)

Directory Naming

Define and document a clear directory structure that includes information related to your jobs’ processing. Individual directories may be set up by date, job name, type of run, involved customers, or whatever makes sense for you and your workflow.

When defining directories, consider how many files may end up in each one. Avoid directories containing many thousands of files, as this may adversely affect the performance of your job’s processing. Try to distribute files across multiple directories to optimize performance.
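
One way to keep any single directory from collecting every run is to derive the directory from the job name and run date. The following is a minimal sketch under that assumption; the function name job_directory, the root path, and the job name are hypothetical and only illustrate the idea.

```python
from datetime import date
from pathlib import PurePosixPath


def job_directory(root: str, job_name: str, run_date: date) -> PurePosixPath:
    """Return root/job_name/YYYY/MM/DD for the given run date.

    Splitting by year, month, and day keeps each directory small and
    makes the layout predictable for both people and scripts.
    """
    return (
        PurePosixPath(root)
        / job_name
        / f"{run_date:%Y}"
        / f"{run_date:%m}"
        / f"{run_date:%d}"
    )


# Hypothetical root and job name, for illustration only.
print(job_directory("/data/jobs", "payments_import", date(2024, 3, 7)))
# /data/jobs/payments_import/2024/03/07
```

If a job produces very large numbers of files per day, the same approach can be extended with another level, such as the hour or a customer acronym.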

Information for File Names

Choose a format for naming your files and use it consistently. You might consider including some of the following information in your file names, but you can include any information that will allow you to distinguish your files from one another.

  • Customer name or vendor name or acronym
  • Department or internal system designation or acronym
  • Date or date range of the data contained in the files
  • Type of data, e.g., payments, images, video
  • Version number or revision identifier of file
  • Production or test file indicator
  • Resend indicator, showing whether the file is a resend of a prior file
  • Three-letter file extension for application-specific files
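
As an illustration, the elements above could be assembled by a small helper so that every job builds names the same way. This is only a sketch of one possible convention: the class name FileNameParts, the field order, the underscore separator, and the prod/test/resend markers are all assumptions, not a prescribed format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileNameParts:
    """Elements of a hypothetical file naming convention."""

    customer: str            # customer or vendor acronym
    data_type: str           # type of data, e.g. payments
    data_date: date          # date of the data contained in the file
    version: int             # version or revision number
    is_test: bool = False    # production or test indicator
    is_resend: bool = False  # resend of a prior file
    extension: str = "csv"   # three-letter file extension

    def build(self) -> str:
        """Join the elements in a fixed order, all lowercase."""
        fields = [
            self.customer.lower(),
            self.data_type.lower(),
            self.data_date.strftime("%Y%m%d"),
            f"v{self.version:02d}",
            "test" if self.is_test else "prod",
        ]
        if self.is_resend:
            fields.append("resend")
        return "_".join(fields) + "." + self.extension.lower()


print(FileNameParts("acme", "payments", date(2024, 3, 7), 2).build())
# acme_payments_20240307_v02_prod.csv
```

Keeping the logic in one place means a change to the convention is made once rather than in every job.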

Formatting File Names

  • Standardize the date designations used in file names, such as YYYYMMDD or YYMMDD. Because the year comes first, files sorted by name also appear in chronological order, which makes them easier to review.
  • Avoid making file names too long since long file names do not work well with all types of software and operating systems.
  • Remember that some operating systems ignore case when comparing file names, so consider the use of capitalization within file names carefully.
  • Avoid special characters such as ~ ! @ # $ % ^ & * ( ) ` ; < > ? , [ ] { } ' " and |.
  • When using a sequential numbering system, use leading zeros for clarity and to make sure files sort in sequential order. For example, use “001, 002, …010, 011 … 100, 101, etc.” instead of “1, 2, …10, 11 … 100, 101, etc.”
  • Carefully consider the use of spaces in file names and directory names. Some software will not recognize file names with spaces, and file names with spaces must be enclosed in quotes when being processed using a command line. Options instead of spaces include:
    • Underscores, e.g.
    • Dashes, e.g.
    • No separation, e.g.
    • Camel case, where the first letter of each section of text is capitalized, e.g. (Again, consider whether the operating system distinguishes between upper and lower case in file names.)
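
Several of the formatting rules above can be applied mechanically. The sketch below is one possible approach, not a standard: the helper names safe_name and numbered, the sample words, and the exact set of characters treated as unsafe (the ones listed above plus whitespace) are assumptions made for illustration. It is intended for the name without its extension.

```python
import re

# Characters the guidance above says to avoid; whitespace is added as an assumption.
UNSAFE_CHARS = "~!@#$%^&*()`;<>?,[]{}'\"|"
_UNSAFE = re.compile("[" + re.escape(UNSAFE_CHARS) + r"\s]+")


def safe_name(raw: str, separator: str = "_") -> str:
    """Replace each run of unsafe characters or spaces with one separator."""
    cleaned = _UNSAFE.sub(separator, raw.strip())
    return cleaned.strip(separator)


def numbered(stem: str, sequence: int, width: int = 3) -> str:
    """Append a zero-padded sequence number so names sort in numeric order."""
    return f"{stem}{sequence:0{width}d}"


# The four alternatives to spaces, applied to two hypothetical words.
words = ["daily", "payments"]
styles = {
    "underscores": "_".join(words),                         # daily_payments
    "dashes": "-".join(words),                              # daily-payments
    "no separation": "".join(words),                        # dailypayments
    "camel case": "".join(w.capitalize() for w in words),   # DailyPayments
}

print(safe_name("Q1 report (final)!"))                      # Q1_report_final
print(sorted(numbered("batch_", n) for n in (10, 2, 1)))
# ['batch_001', 'batch_002', 'batch_010']
```

Without the leading zeros, a plain text sort would place batch_10 before batch_2.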

Documenting Your File Naming Conventions

Coming up with a file naming convention takes effort and commitment. Once you have one, document it and make sure it is followed. Budget time to periodically audit directories and file names to confirm that the convention is being applied. Revise the convention as needed when you encounter exceptions it did not address.
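
A periodic audit can be partly automated by expressing the documented convention as a pattern and listing the files that do not match it. The following is a minimal sketch; the pattern describes a hypothetical convention (customer_datatype_YYYYMMDD_vNN_prod or test, an optional _resend, and a three-character extension), and the function names are assumptions. Substitute the pattern that matches your own documented convention.

```python
import re
from pathlib import Path

# Hypothetical convention: customer_datatype_YYYYMMDD_vNN_(prod|test)[_resend].ext
PATTERN = re.compile(
    r"[a-z0-9]+_[a-z0-9]+_\d{8}_v\d{2}_(prod|test)(_resend)?\.[a-z0-9]{3}"
)


def nonconforming(names):
    """Return the names that do not fully match the convention."""
    return [name for name in names if not PATTERN.fullmatch(name)]


def audit_directory(root):
    """Walk root recursively and return files whose names break the convention."""
    return [
        path
        for path in Path(root).rglob("*")
        if path.is_file() and not PATTERN.fullmatch(path.name)
    ]


print(nonconforming(["acme_payments_20240307_v02_prod.csv", "Final Report.xlsx"]))
# ['Final Report.xlsx']
```

The list of exceptions such an audit produces is also a useful input when deciding whether the convention itself needs revising.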

Sample File Name Best Practice Documents

Smithsonian Data Management Best Practices

Stanford Libraries – File Naming Best Practices

Staffing Industry Analysts – Adam Pode