I’m working on a certain program where I need to do different things depending on the extension of the file. Could I just use this?
if m == *.mp3
...
elif m == *.flac
...
asked May 5, 2011 at 14:34
Assuming m is a string, you can use endswith:
if m.endswith('.mp3'):
    ...
elif m.endswith('.flac'):
    ...
To be case-insensitive, and to eliminate a potentially large else-if chain:
m.lower().endswith(('.png', '.jpg', '.jpeg'))
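A minimal sketch of the tuple form: `str.endswith` accepts a tuple of suffixes, so a single call can cover several extensions. The helper name `has_extension` and the sample file names are made up for illustration.

```python
def has_extension(name, extensions):
    """Return True if name ends with any of the given extensions, ignoring case."""
    # str.endswith accepts a tuple of suffixes, so no elif chain is needed.
    return name.lower().endswith(tuple(ext.lower() for ext in extensions))

IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg')

print(has_extension('Holiday.JPG', IMAGE_EXTENSIONS))  # True
print(has_extension('song.mp3', IMAGE_EXTENSIONS))     # False
```

Note that the lowercased copy is only used for the comparison; the original file name is left untouched.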
answered May 5, 2011 at 14:37
lafras
os.path provides many functions for manipulating paths/filenames. (docs)
os.path.splitext takes a path and splits the file extension from the end of it.
import os
filepaths = ["/folder/soundfile.mp3", "folder1/folder/soundfile.flac"]
for fp in filepaths:
    # Split the extension from the path and normalise it to lowercase.
    ext = os.path.splitext(fp)[-1].lower()
    # Now we can simply use == to check for equality, no need for wildcards.
    if ext == ".mp3":
        print(fp, "is an mp3!")
    elif ext == ".flac":
        print(fp, "is a flac file!")
    else:
        print(fp, "is an unknown file format.")
Gives:
/folder/soundfile.mp3 is an mp3!
folder1/folder/soundfile.flac is a flac file!
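If the list of formats grows, one possible variation is to look the extension up in a dictionary instead of extending the if/elif chain. This is only a sketch: the handler functions and the `HANDLERS` mapping are hypothetical names, not part of the answer above.

```python
import os

def handle_mp3(path):
    return f"{path} is an mp3!"

def handle_flac(path):
    return f"{path} is a flac file!"

# Hypothetical mapping from lowercase extension to handler function.
HANDLERS = {
    ".mp3": handle_mp3,
    ".flac": handle_flac,
}

def describe(path):
    # Normalise only the extension; the path itself is left unchanged.
    ext = os.path.splitext(path)[1].lower()
    handler = HANDLERS.get(ext)
    if handler is None:
        return f"{path} is an unknown file format."
    return handler(path)

for fp in ["/folder/soundfile.mp3", "folder1/folder/soundfile.FLAC", "notes.txt"]:
    print(describe(fp))
```

Adding a new format then means adding one dictionary entry rather than another branch.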
answered May 5, 2011 at 15:46
Acorn
Use pathlib
Available from Python 3.4 onwards.
from pathlib import Path
Path('my_file.mp3').suffix == '.mp3'
If you are working with folders whose names contain periods, you can perform an extra check using
Path('your_folder.mp3').is_file() and Path('your_folder.mp3').suffix == '.mp3'
to ensure that a folder with a .mp3 suffix is not interpreted to be an mp3 file.
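To see the difference, here is a small sketch that creates a temporary folder named `album.mp3` and a real file named `track.mp3`. The helper `is_mp3_file` and both names are illustrative, and lowercasing the suffix is an added choice that the check above does not make.

```python
import tempfile
from pathlib import Path

def is_mp3_file(path):
    """True only for an existing regular file whose suffix is .mp3 (case-insensitive)."""
    p = Path(path)
    return p.is_file() and p.suffix.lower() == '.mp3'

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / 'album.mp3'   # a directory with a misleading name
    folder.mkdir()
    track = Path(tmp) / 'track.mp3'    # an (empty) regular file
    track.touch()

    results = {p.name: is_mp3_file(p) for p in (folder, track)}

print(results)  # {'album.mp3': False, 'track.mp3': True}
```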
answered Jun 11, 2018 at 5:31
Greg
Look at the fnmatch module. That will do what you’re trying to do.
import fnmatch
import os
for file in os.listdir('.'):
    if fnmatch.fnmatch(file, '*.txt'):
        print(file)
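One thing to keep in mind is that `fnmatch.fnmatch` normalises case with `os.path.normcase`, so whether `'*.txt'` matches `README.TXT` depends on the operating system. The sketch below uses `fnmatch.fnmatchcase`, which never normalises case, to get the same result everywhere; the file names are made up.

```python
import fnmatch

names = ['notes.txt', 'README.TXT', 'song.mp3']

# fnmatchcase never normalises case, so the result is the same on every OS.
exact = [n for n in names if fnmatch.fnmatchcase(n, '*.txt')]

# Lowercasing the name first gives a portable case-insensitive match.
relaxed = [n for n in names if fnmatch.fnmatchcase(n.lower(), '*.txt')]

print(exact)    # ['notes.txt']
print(relaxed)  # ['notes.txt', 'README.TXT']
```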
answered May 5, 2011 at 15:03
John Gaines Jr.
Or perhaps:
from glob import glob
...
for file in glob('path/*.mp3'):
    pass  # do something
for file in glob('path/*.flac'):
    pass  # do something else
answered May 5, 2011 at 14:58
phynfo
One easy way could be:
import os
if os.path.splitext(file)[1] == ".mp3":
    pass  # do something
os.path.splitext(file) returns a tuple with two values (the filename without the extension, and just the extension). The second item (index [1]) therefore gives you just the extension. The nice thing is that this way you can also access the filename easily, if needed.
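A few edge cases are worth knowing before relying on the second item of the tuple. The sample names below are invented for illustration; the behaviour shown is that of `os.path.splitext` from the standard library.

```python
import os

samples = ["song.mp3", "archive.tar.gz", ".bashrc", "a.b/README"]

# Map each sample to its (root, extension) pair.
parts = {name: os.path.splitext(name) for name in samples}

for name, (root, ext) in parts.items():
    print(name, "->", (root, ext))

# song.mp3 -> ('song', '.mp3')
# archive.tar.gz -> ('archive.tar', '.gz')   only the last suffix is split off
# .bashrc -> ('.bashrc', '')                 a leading dot is not an extension
# a.b/README -> ('a.b/README', '')           dots in directory names are ignored
```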
answered Mar 7, 2017 at 9:11
upgrd
An old thread, but this may help future readers…
I would avoid using .lower() on filenames, if for no other reason than to make your code more platform independent. (Linux is case-sensitive; calling .lower() on a filename will eventually corrupt your logic, or worse, an important file!)
Why not use re? (Although to be even more robust, you should check the magic file header of each file; see "How to check type of files without extensions in python?")
import re
def checkext(fname):
    # Escape the dot so it matches a literal "." rather than any character.
    if re.search(r'\.mp3$', fname, flags=re.IGNORECASE):
        return 'mp3'
    if re.search(r'\.flac$', fname, flags=re.IGNORECASE):
        return 'flac'
    return 'skip'
flist = ['myfile.mp3', 'myfile.MP3','myfile.mP3','myfile.mp4','myfile.flack','myfile.FLAC',
         'myfile.Mov','myfile.fLaC']
for f in flist:
    print("{} ==> {}".format(f, checkext(f)))
Output:
myfile.mp3 ==> mp3
myfile.MP3 ==> mp3
myfile.mP3 ==> mp3
myfile.mp4 ==> skip
myfile.flack ==> skip
myfile.FLAC ==> flac
myfile.Mov ==> skip
myfile.fLaC ==> flac
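As a possible variation, the per-extension searches can be folded into one compiled pattern. This is a sketch rather than the answer's own code: `EXT_PATTERN` and `check_ext` are illustrative names. It also shows why the dot needs escaping, since an unescaped `.` matches any character.

```python
import re

# One pattern for every known extension; the dot is escaped and $ anchors the end.
EXT_PATTERN = re.compile(r'\.(mp3|flac)$', re.IGNORECASE)

def check_ext(fname):
    match = EXT_PATTERN.search(fname)
    # group(1) is the extension as written in the name, so normalise it.
    return match.group(1).lower() if match else 'skip'

# An unescaped dot matches any character, which is easy to miss:
print(bool(re.search('.mp3$', 'notanmp3')))  # True (a false positive)
print(check_ext('notanmp3'))                 # skip
print(check_ext('myfile.fLaC'))              # flac
```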
answered Dec 15, 2017 at 6:08
Dan F.
You should make sure the “file” isn’t actually a folder before checking the extension. Some of the answers above don’t account for folder names with periods (folder.mp3 is a valid folder name).
Checking the extension of a file:
import os
file_path = "C:/folder/file.mp3"
if os.path.isfile(file_path):
    file_extension = os.path.splitext(file_path)[1]
    if file_extension.lower() == ".mp3":
        print("It's an mp3")
    if file_extension.lower() == ".flac":
        print("It's a flac")
Output:
It's an mp3
Checking the extension of all files in a folder:
import os
directory = "C:/folder"
for file in os.listdir(directory):
    file_path = os.path.join(directory, file)
    if os.path.isfile(file_path):
        file_extension = os.path.splitext(file_path)[1]
        print(file, "ends in", file_extension)
Output:
abc.txt ends in .txt
file.mp3 ends in .mp3
song.flac ends in .flac
Comparing file extension against multiple types:
import os
file_path = "C:/folder/file.mp3"
if os.path.isfile(file_path):
    file_extension = os.path.splitext(file_path)[1]
    if file_extension.lower() in {'.mp3', '.flac', '.ogg'}:
        print("It's a music file")
    elif file_extension.lower() in {'.jpg', '.jpeg', '.png'}:
        print("It's an image file")
Output:
It's a music file
answered Nov 12, 2020 at 20:13
Stevoisiak
import os
source = ['test_sound.flac','ts.mp3']
for files in source:
    fileName, fileExtension = os.path.splitext(files)
    print(fileExtension)  # Print the file extension
    print(fileName)  # Print the file name
answered May 17, 2016 at 10:42
#!/usr/bin/python
import os
source = ['test_sound.flac','ts.mp3']
for files in source:
    fileName, fileExtension = os.path.splitext(files)
    if fileExtension == ".flac":
        print('This file is flac file %s' % files)
    elif fileExtension == ".mp3":
        print('This file is mp3 file %s' % files)
    else:
        print('Format is not valid')
answered Sep 18, 2014 at 20:53
nprak
# Split on the last dot only, so names such as "my.song.mp3" still work.
ext = file.rsplit(".", 1)[-1]
if ext == "mp3":
    print("its mp3")
elif ext == "flac":
    print("its flac")
else:
    print("not compat")
answered Jul 19, 2017 at 7:27
If your file is uploaded, then:
import os
file = request.FILES['your_file_name']  # Your input file_name for your_file_name
ext = os.path.splitext(file.name)[-1].lower()
if ext == '.mp3':
    pass  # do something
elif ext in ('.xls', '.xlsx', '.csv'):  # "ext == '.xls' or '.xlsx'" would always be true
    pass  # do something
else:
    pass  # The uploaded file is not the required format
answered Nov 20, 2021 at 1:13
Merrin K
file = 'test.xlsx'
if file.endswith('.csv'):
    print('file is CSV')
elif file.endswith('.xlsx'):
    print('file is excel')
else:
    print('none of them')
answered May 29, 2019 at 11:57
I’m surprised none of the answers proposed the use of the pathlib library.
Of course, its use is situational, but when it comes to file handling or stats, pathlib is gold.
Here’s a snippet:
import pathlib
from typing import Union
def get_parts(p: Union[str, pathlib.Path]) -> None:
    p_ = pathlib.Path(p).expanduser().resolve()
    print(p_)
    print(f"file name: {p_.name}")
    print(f"file extension: {p_.suffix}")
    print(f"file extensions: {p_.suffixes}\n")
if __name__ == '__main__':
    file_path = 'conf/conf.yml'
    arch_file_path = 'export/lib.tar.gz'
    get_parts(p=file_path)
    get_parts(p=arch_file_path)
and the output:
/Users/hamster/temp/src/pro1/conf/conf.yml
file name: conf.yml
file extension: .yml
file extensions: ['.yml']
/Users/hamster/temp/src/pro1/export/lib.tar.gz
file name: lib.tar.gz
file extension: .gz
file extensions: ['.tar', '.gz']
answered Nov 23, 2022 at 9:29
Gergely M
In this article, we will cover how to extract file extensions using Python.
How to Get File Extension in Python?
To get a file extension in Python, we can use either of the two approaches discussed below:
- Use the os.path Module to Extract Extension From File in Python
- Use the pathlib Module to Extract Extension From File in Python
Method 1: Using Python os module splitext() function
The splitext() function splits the file path string into a pair of root and extension, such that concatenating the two gives back the file path (file_name + extension = path). This function is a natural choice when the os module is already being used.
Python3
import os
split_tup = os.path.splitext('my_file.txt')
print(split_tup)
file_name = split_tup[0]
file_extension = split_tup[1]
print("File Name: ", file_name)
print("File Extension: ", file_extension)
Output:
('my_file', '.txt')
File Name: my_file
File Extension: .txt
Method 2: Using the pathlib module
The suffix attribute of a pathlib.Path object can be used to extract the extension of the file path. This approach is preferred for an object-oriented style.
Python3
import pathlib
file_extension = pathlib.Path('my_file.txt').suffix
print("File Extension: ", file_extension)
Output:
File Extension: .txt
Last Updated: 30 Aug, 2022
In Python, we can extract the file extension using two approaches. Let’s take a look at each of these with examples.
Python get file extension using os module splitext() function
The os module has extensive functions for interacting with the operating system. The OS module can be used to easily create, modify, delete, and fetch file contents or directories.
Syntax: os.path.splitext(path)
The function splitext() takes the path as an argument and returns a tuple with the filename and the extension.
import os
# returns tuple with filename and extension
file_details = os.path.splitext('/home/usr/sample.txt')
print("File Details ",file_details)
# extract the file name and extension
file_name = file_details[0]
file_extension = file_details[1]
print("File Name: ", file_name)
print("File Extension: ", file_extension)
Output
File Details ('/home/usr/sample', '.txt')
File Name: /home/usr/sample
File Extension: .txt
Python get file extension using pathlib module
The pathlib module comes as a standard utility module in Python and offers classes representing filesystem paths with semantics appropriate for different operating systems.
The pathlib.Path().suffix property can be used to extract the extension of the given file path.
import pathlib
# pathlib function which returns the file extension
file_extension = pathlib.Path('/home/usr/sample.txt').suffix
print("The given File Extension is : ", file_extension)
Output
The given File Extension is : .txt
What if your file name has multiple dots, like sample.tar.gz? With the methods above, you only get the last part of the extension, not the full extension.
You can use the pathlib module's suffixes property, which returns all the extensions as a list. We can then join them into a single string, as shown below.
import pathlib
# pathlib property which returns all the file extensions as a list
file_extension = pathlib.Path('/home/usr/sample.tar.gz').suffixes
print("File extension ", file_extension)
print("The given File Extension is : ", ''.join(file_extension))
Output
File extension ['.tar', '.gz']
The given File Extension is : .tar.gz
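One caveat with joining suffixes: every dot-separated part of the file name after the first counts as a suffix, so a name with a version number yields more than the real extension. A minimal sketch of one way around this, assuming you know in advance which compound extensions matter (the `KNOWN_COMPOUND` tuple and `full_extension` helper are hypothetical):

```python
import pathlib

# Hypothetical list of compound extensions this program cares about.
KNOWN_COMPOUND = ('.tar.gz', '.tar.bz2', '.tar.xz')

def full_extension(path):
    """Return a known compound extension if the name ends with one, else the last suffix."""
    p = pathlib.Path(path)
    lowered = p.name.lower()
    for ext in KNOWN_COMPOUND:
        if lowered.endswith(ext):
            return ext
    return p.suffix.lower()

path = '/home/usr/release.v1.2.tar.gz'
print(''.join(pathlib.Path(path).suffixes))    # .v1.2.tar.gz
print(full_extension(path))                    # .tar.gz
print(full_extension('/home/usr/sample.txt'))  # .txt
```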
Srinivas Ramakrishna is a Solution Architect and has 14+ Years of Experience in the Software Industry. He has published many articles on Medium, Hackernoon, dev.to and solved many problems in StackOverflow. He has core expertise in various technologies such as Microsoft .NET Core, Python, Node.JS, JavaScript, Cloud (Azure), RDBMS (MSSQL), React, Powershell, etc.
In this tutorial, you’ll learn how to use Python to get a file extension. You’ll accomplish this using both the pathlib library and the os.path module.
Being able to work with files easily is one of Python’s greatest strengths. You could, for example, use the glob library to iterate over files in a folder. When you do this, knowing the file extension of each file may drive further decisions. Because of this, knowing how to get a file’s extension is an important skill! Let’s get started learning how to use Python to get a file’s extension on Windows, Mac, and Linux!
The Quick Answer: Use Pathlib
Using Python Pathlib to Get a File’s Extension
The Python pathlib library makes it easy to work with and manipulate paths, so it makes sense that the library has a way of accessing a file’s extension.
The pathlib library comes with a class named Path, which we use to create path-based objects. When we load our file’s path into a Path object, we can access specific attributes of the object through its built-in properties.
Let’s see how we can use the pathlib library in Python to get a file’s extension:
# Get a file's extension using pathlib
import pathlib
file_path = "/Users/datagy/Desktop/Important Spreadsheet.xlsx"
extension = pathlib.Path(file_path).suffix
print(extension)
# Returns: .xlsx
We can see here that we passed a file’s path into the Path class, creating a Path object. After that, we can access different attributes, including the .suffix attribute. We assigned this to a variable named extension and printed it, getting .xlsx back.
This method works well on both Mac and Linux computers. On Windows, however, file paths are written a little differently: they use backslashes as separators.
Because of this, when using Windows, create your file path as a “raw” string by prefixing the string with an r, like this: r'some string'. This tells Python not to treat the backslashes as escape characters.
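As a rough illustration of the raw-string advice, the sketch below parses a Windows-style path with `pathlib.PureWindowsPath`, which handles Windows paths on any operating system. The path itself is a made-up Windows counterpart of the example above. Without the `r` prefix, the `\U` in this particular string would be read as the start of a Unicode escape and raise a syntax error.

```python
from pathlib import PureWindowsPath

# Raw string: the backslashes are kept literally instead of starting escapes.
raw_path = r"C:\Users\datagy\Desktop\Important Spreadsheet.xlsx"

# PureWindowsPath parses Windows-style paths on any operating system.
win_path = PureWindowsPath(raw_path)

print(win_path.suffix)  # .xlsx
print(win_path.name)    # Important Spreadsheet.xlsx
```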
Now that we’ve taken a look at how to use pathlib in Python to get a file extension, let’s explore how we can do the same using the os.path module.
Want to learn more? Want to learn how to use the pathlib library to automatically rename files in Python? Check out my in-depth tutorial and video on Towards Data Science!
Using os.path in Python to Get a File’s Extension
The os.path module allows us to easily work with, well, our operating system! The path module lets us use file paths in different ways, including getting a file’s extension.
The os.path module has a helpful function, splitext(), which splits a file path into two parts: everything before the extension, and the extension itself. Thankfully, splitext() is a smart function that knows how to separate out file extensions, rather than simply splitting a string.
Let’s take a look at how we can use the splitext() function to get a file’s extension:
# Get a file's extension using os.path
import os.path
file_path = "/Users/datagy/Desktop/Important Spreadsheet.xlsx"
extension = os.path.splitext(file_path)[-1]
print(extension)
# Returns: .xlsx
Let’s take a look at what we’ve done here:
- We import os.path. Rather than writing from os import path, we use this form of import so that we can leave the variable name path open and clear.
- We load our file_path variable. Remember: if you’re on Windows, make your file path a raw string by prefixing an r before the opening quotation mark.
- We apply the splitext() function to the file path. We then access the last item of the result.
The splitext() function returns a tuple: the first part is the path without the extension, and the second is the extension. Because of this, if we only want a file’s extension, we can just access the tuple’s last item.
How to Use a Python File Extension
Now that you’ve learned two different ways to use Python to get a file’s extension, how can you apply this?
One handy method is to act on, say, only Excel files. If you’re writing a for-loop, you could first check to see if a file is an Excel file and then load it into a Pandas dataframe. This approach would let you skip the files that may not actually contain any data.
Let’s see how to do this in Python and Pandas:
# Load only the Excel files, selected by their extension using pathlib
import pathlib
import pandas as pd
file_paths = ["/Users/datagy/Desktop/Important Spreadsheet.xlsx", "/Users/datagy/Desktop/A Random Document.docx"]
frames = []
for file in file_paths:
    if pathlib.Path(file).suffix in ('.xls', '.xlsx'):
        frames.append(pd.read_excel(file))
# DataFrame.append was removed in pandas 2.0, so combine the frames with pd.concat
df = pd.concat(frames) if frames else pd.DataFrame()
Now that you’ve seen a practical example, check out my other Pandas tutorials here, including how to calculate an average in Pandas and how to add days to a Pandas column.
Conclusion
In this post, you learned how to use Python to get a file’s extension. You learned how to do this using both the pathlib library and the os.path module, using the splitext() function. You learned how to do this on Windows, Mac and Linux, to ensure that your code can run across systems.
To learn more about the splitext() function, check out the official documentation here.
We can use the splitext() function of Python's os module to get a file's extension. This function splits the file path into a tuple with two values: the root and the extension.
Here is a simple program to get the file extension in Python.
import os
# unpacking the tuple
file_name, file_extension = os.path.splitext("/Users/pankaj/abc.txt")
print(file_name)
print(file_extension)
print(os.path.splitext("/Users/pankaj/.bashrc"))
print(os.path.splitext("/Users/pankaj/a.b/image.png"))
Output:
/Users/pankaj/abc
.txt
('/Users/pankaj/.bashrc', '')
('/Users/pankaj/a.b/image', '.png')
- In the first example, we unpack the tuple values directly into two variables.
- Note that the .bashrc file has no extension. The leading dot in the file name makes it a hidden file.
- In the third example, the directory name contains a dot.
Getting the file extension using the pathlib module
We can also use the pathlib module to get the file extension. This module was introduced in Python 3.4.
>>> import pathlib
>>> pathlib.Path("/Users/pankaj/abc.txt").suffix
'.txt'
>>> pathlib.Path("/Users/pankaj/.bashrc").suffix
''
>>> pathlib.Path("/Users/pankaj/.bashrc")
PosixPath('/Users/pankaj/.bashrc')
>>> pathlib.Path("/Users/pankaj/a.b/abc.jpg").suffix
'.jpg'
>>>
It is always better to use the standard methods to get a file extension. If you are already using the os module, use the splitext() function. For an object-oriented approach, use the pathlib module.
Getting the file size
We can get the size of a file in Python using the os module.
The os module has a stat() function, to which we can pass the file name as an argument. This function returns a tuple-like structure containing information about the file. We can then read its st_size attribute to get the file size in bytes.
Here is a simple program that prints the file size in bytes and megabytes.
# get file size in python
import os
file_name = "/Users/pankaj/abcdef.txt"
file_stats = os.stat(file_name)
print(file_stats)
print(f'File Size in Bytes is {file_stats.st_size}')
print(f'File Size in MegaBytes is {file_stats.st_size / (1024 * 1024)}')
If you look at the stat() function, we can pass two more arguments: dir_fd and follow_symlinks. However, they are not available on every platform, as the Mac OS run below shows.
Here is an updated program in which I try to use a relative path, but it raises NotImplementedError.
# get file size in python
import os
file_name = "abcdef.txt"
relative_path = os.open("/Users/pankaj", os.O_RDONLY)
file_stats = os.stat(file_name, dir_fd=relative_path)
Output:
Traceback (most recent call last):
  File "/Users/pankaj/.../get_file_size.py", line 7, in <module>
    file_stats = os.stat(file_name, dir_fd=relative_path)
NotImplementedError: dir_fd unavailable on this platform
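One way to avoid that error is to check `os.supports_dir_fd` before passing `dir_fd`. The sketch below is only an illustration under stated assumptions: it writes a throwaway 2048-byte file into a temporary directory (the file name is reused from the example above, everything else is made up) and falls back to the plain full-path call when `dir_fd` is not supported.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    file_name = "abcdef.txt"
    with open(os.path.join(tmp, file_name), "wb") as fh:
        fh.write(b"x" * 2048)

    # Plain full-path call: works on every platform.
    size_full = os.stat(os.path.join(tmp, file_name)).st_size

    # dir_fd is only used when the platform advertises support for it.
    size_relative = None
    if os.stat in os.supports_dir_fd:
        dir_fd = os.open(tmp, os.O_RDONLY)
        try:
            size_relative = os.stat(file_name, dir_fd=dir_fd).st_size
        finally:
            os.close(dir_fd)

print(size_full)      # 2048
print(size_relative)  # 2048 where dir_fd is supported, otherwise None
```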