Page MenuHomePhabricator

[lib][native] Sanitize media upload filenames
ClosedPublic

Authored by bartek on Feb 1 2023, 8:39 AM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Jan 6, 10:32 AM
Unknown Object (File)
Mon, Jan 6, 10:32 AM
Unknown Object (File)
Mon, Jan 6, 10:32 AM
Unknown Object (File)
Mon, Jan 6, 10:32 AM
Unknown Object (File)
Mon, Jan 6, 10:27 AM
Unknown Object (File)
Nov 16 2024, 9:35 PM
Unknown Object (File)
Nov 14 2024, 9:35 PM
Unknown Object (File)
Nov 14 2024, 8:50 PM
Subscribers

Details

Summary

Related: ENG-2562, ENG-2475

  • Created a function that replaces all invalid characters in filename with underscores. If the filename is not present, it is randomly generated and extension is deduced from MIME type.
  • Made the filename optional in MediaLibrarySelection type.
  • Used the above function to make sure that filename is valid when processed in processMedia().
Test Plan

On both platforms various different photos/videos including these containing spaces. Confirmed that all of them work and thumbnail generation doesn't crash.

Also manually tested the regex: https://regex101.com/r/1FCRbJ/2

Diff Detail

Repository
rCOMM Comm
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

bartek held this revision as a draft.
bartek published this revision for review.Feb 1 2023, 10:14 AM
atul requested changes to this revision.Feb 1 2023, 10:35 AM

So it appears that we're passing the filename down to processVideo which in turn passes it to getVideoProcessingPlan which is where we try to construct outputPaths for transcoded video + thumbnail and things appear to fail when we try to write to disk at the outputPaths? Wonder if we should push the sanitization down to where we're trying to construct a path so filename is "accurate" as it's being passed along to various functions?

I understand that the filename "doesn't really matter" because we're using uri to actually access the file, but I'm a little concerned about setting it to a value that doesn't reflect the actual name of the file? Like down the line maybe someone assumes that there's a file at ${uri.getParentDirectory}/${filename} but there's actually no file with that name?

Could we pass mime down to getVideoProcessingPlan and then call sanitizeFilename where we're constructing filenames?

Like turn

47f8f9.png (252×890 px, 44 KB)

into something like

eca362.png (300×1 px, 57 KB)


Definitely feel free to let me know if there's something I'm missing/misunderstanding here

This revision now requires changes to proceed.Feb 1 2023, 10:35 AM
In D6493#194479, @atul wrote:

So it appears that we're passing the filename down to processVideo which in turn passes it to getVideoProcessingPlan which is where we try to construct outputPaths for transcoded video + thumbnail and things appear to fail when we try to write to disk at the outputPaths?

Yes, that's my suspicion and it seems to be correct

Wonder if we should push the sanitization down to where we're trying to construct a path so filename is "accurate" as it's being passed along to various functions?

Don't have a strong opinion on that...

I understand that the filename "doesn't really matter" because we're using uri to actually access the file, but I'm a little concerned about setting it to a value that doesn't reflect the actual name of the file? Like down the line maybe someone assumes that there's a file at ${uri.getParentDirectory}/${filename} but there's actually no file with that name?

My main concern and the reason why I made these inputs optional is that on modern mobile OS we shouldn't assume that the filename is always available - due to security reasons, we usually receive a content URI with minimal knowledge about its origin. For instance, modern Android usually gives you a content:// URI with very few permissions, like content://com.android.providers.media.documents/document/image:56. The app can then query a content resolver for more info like filename etc, but it isn't guaranteed to be available. Currently, it usually is (e.g. through expo-media-library), but each android version increases privacy even more.
On iOS, expo-image-picker uses the suggestedName to get a filename, but it's a nullable type.
So in conclusion, we shouldn't rely on origin filenames too much.

Could we pass mime down to getVideoProcessingPlan and then call sanitizeFilename where we're constructing filenames?

Like turn

47f8f9.png (252×890 px, 44 KB)

into something like

eca362.png (300×1 px, 57 KB)

Sure, that shouldn't be a problem.

My main concern and the reason why I made these inputs optional is that on modern mobile OS we shouldn't assume that the filename is always available - due to security reasons, we usually receive a content URI with minimal knowledge about its origin. For instance, modern Android usually gives you a content:// URI with very few permissions, like content://com.android.providers.media.documents/document/image:56. The app can then query a content resolver for more info like filename etc, but it isn't guaranteed to be available. Currently, it usually is (e.g. through expo-media-library), but each android version increases privacy even more.
On iOS, expo-image-picker uses the suggestedName to get a filename, but it's a nullable type.
So in conclusion, we shouldn't rely on origin filenames too much.

Ahhh that makes sense, thanks for explaining and pointing out the specific Foundation API where filename is nullable

Sure, that shouldn't be a problem.

Awesome, thanks

Passed calling sanitizeFilename down to getVideoProcessingPlan

This revision is now accepted and ready to land.Feb 2 2023, 10:01 AM