Feature Request: Add support for WMV (Windows Media Video) and ASF formats #1361

Description

@kondomodoru

Is your feature request related to a problem? Please describe.

Yes, my feature request is related to a problem.

When using Magika to identify file types on our upload server, we've found that it cannot detect Windows Media Video (.wmv) files. They are identified as unknown or application/octet-stream. Because WMV is built on the Advanced Systems Format (ASF) container, other ASF-based formats such as WMA (Windows Media Audio) are probably not detected either.

While we can use other tools like libmagic or TrID as a fallback, we would prefer to use Magika as our primary tool due to its speed and resilience against spoofed extensions. Having native support for WMV/ASF would greatly simplify our file validation workflow.
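
For context, here is a minimal sketch of the container-level check we have in mind. Every ASF file (WMV and WMA included) begins with the 16-byte ASF Header Object GUID, so this signature identifies the container but cannot by itself tell video from audio. The function name `looks_like_asf` is ours and purely illustrative; it is not part of Magika.

```python
import uuid

# ASF Header Object GUID. On disk the first three fields are little-endian,
# which is exactly what UUID.bytes_le produces.
ASF_HEADER_GUID = uuid.UUID("75B22630-668E-11CF-A6D9-00AA0062CE6C").bytes_le


def looks_like_asf(data: bytes) -> bool:
    """Return True if the buffer starts with the ASF Header Object GUID."""
    return data[:16] == ASF_HEADER_GUID
```

A check like this only confirms the container. Separating WMV from WMA would additionally require inspecting the stream properties inside the header.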

Describe the solution you'd like

I would like Magika's model to be trained to correctly identify files in the ASF (Advanced Systems Format) container, which includes:

WMV (Windows Media Video)
WMA (Windows Media Audio)
ASF itself

The expected content type labels would be something like wmv and wma, with corresponding MIME types such as video/x-ms-wmv and audio/x-ms-wma.

Describe alternatives you've considered

We are currently implementing a hybrid approach: we run Magika first and, if it returns unknown, fall back to another tool such as python-magic (libmagic) to check for WMV/ASF.

However, this adds complexity to our code. Native support in Magika would be a much cleaner and more efficient solution.
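
A rough sketch of that fallback logic, to show the extra plumbing involved. The detectors are passed in as plain callables so the sketch does not depend on a particular Magika or python-magic version. `identify_with_fallback`, `Detector`, and `UNKNOWN_LABELS` are hypothetical names of ours, not part of either library.

```python
from typing import Callable, Optional

# A detector takes raw bytes and returns a label or MIME type (or None).
Detector = Callable[[bytes], Optional[str]]

# Results we treat as "the primary detector could not identify the file".
# This set is an assumption based on the outputs we have observed.
UNKNOWN_LABELS = {"unknown", "application/octet-stream"}


def identify_with_fallback(
    data: bytes, primary: Detector, fallback: Detector
) -> Optional[str]:
    """Return the primary result unless it is unknown, else ask the fallback."""
    label = primary(data)
    if label is not None and label not in UNKNOWN_LABELS:
        return label
    return fallback(data)
```

In our setup, `primary` would wrap a call to Magika's `identify_bytes`, and `fallback` would wrap python-magic's `magic.from_buffer(data, mime=True)`. The exact attribute that holds the label on the Magika result differs between releases, so that wrapper is left out here.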

Additional context

Our use case is validating user-uploaded files on a server to ensure they are safe and match their expected types. While WMV is an older format, we still have a significant number of users who upload these files. Supporting this format would enhance Magika's utility as a comprehensive file identification tool.

Thank you for considering this request!
