-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(filename): handle files beginning with a period or without extensions #185
base: main
Are you sure you want to change the base?
Conversation
@@ -57,10 +57,20 @@ export function resolveAlias(path: string, aliases: Record<string, string>) { | |||
return _path; | |||
} | |||
|
|||
const FILENAME_RE = /(^|[/\\])([^/\\]+?)(?=(\.[^.]+)?$)/; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can improve regex instead to keep performance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(it could be two checks, still keeping finalname logic independent)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I just ran some quick tests and it does seem slower. The bulk of the time comes from basename
so maybe that can be swapped out.
This implementation seems to have a slight edge on the regex implementation when running on node 20.
export function filename(path: string) {
const base = path.split(/[/\\]/).pop();
if (!base) {
return undefined;
}
const separatorIndex = base.lastIndexOf(".");
if (separatorIndex <= 0) {
return base;
}
return base.slice(0, separatorIndex);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems even better. what is edge case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just mean the results are very close. This approach is faster for 10 of the test strings and slower for 6 of them, but the difference is typically < 15%.
Here's the quick test I ran:
import { group, bench, run } from "mitata";
import { basename } from "pathe";
const paths = Object.keys({
// POSIX
"test.html": "test",
"/temp/myfile.html": "myfile",
"./myfile.html": "myfile",
"/Users/john.doe/foo/myFile.js": "myFile",
"/Users/john.doe/foo/myFile": "myFile",
"./.hidden/myFile.ts": "myFile",
"./.hidden/myFile": "myFile",
"/temp/.gitignore": ".gitignore",
// Windows
"C:\\temp\\": undefined,
"C:\\temp\\myfile.html": "myfile",
"\\temp\\myfile.html": "myfile",
".\\myfile.html": "myfile",
".\\john.doe\\myfile.js": "myfile",
".\\john.doe\\myfile": "myfile",
".\\.hidden\\myfile.js": "myfile",
".\\.hidden\\myfile": "myfile",
"C:\\temp\\.gitignore": ".gitignore",
})
const FILENAME_RE = /(^|[/\\])([^/\\]+?)(?=(\.[^.]+)?$)/;
const regex = function filename(path) {
return path.match(FILENAME_RE)?.[2];
}
const noRegex = function filename(path) {
const base = basename(path);
if (!base) {
return undefined;
}
const separatorIndex = base.lastIndexOf(".");
if (separatorIndex <= 0) {
return base;
}
return base.slice(0, separatorIndex);
}
const mixed = function filename(path) {
const base = path.split(/[/\\]/).pop()
if (!base) {
return undefined;
}
const separatorIndex = base.lastIndexOf(".");
if (separatorIndex <= 0) {
return base;
}
return base.slice(0, separatorIndex);
}
const SLASH_RE=/[/\\]/
const lastIndex = function filename(path) {
let slashIndex = -1;
let len = path.length
for (let i = len - 1; i >= 0; i--) {
if (SLASH_RE.test(path.charAt(i))) {
slashIndex = i;
break;
}
}
const base = slashIndex === -1 ? path : path.slice(slashIndex)
if (!base) {
return undefined;
}
const separatorIndex = base.lastIndexOf(".");
if (separatorIndex <= 0) {
return base;
}
return base.slice(0, separatorIndex);
}
paths.forEach((fixture) => {
group(fixture, () => {
Object.entries({
regex,
noRegex,
basename,
mixed,
lastIndex
}).forEach(([name, fn]) => {
bench(name + ": " + fixture, () => {
fn(fixture);
});
});
});
});
run();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, bun seems to do much better with the iteration approach in lastIndex
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you might make it faster but avoid split/pop and instead look(backward) for the last segment separator in a single for of
loop. In the same iteration, you can spot the first(last) dot position. (two chars with one stone haha)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! That does seem to help, but it's still behind the other approaches. I added one more approach to use basename
, but with the path splitting done with a regex instead of normalizeWindowsPath(path).split('/')
.
It performs nearly identically to the mixed
function, but if the regex splitting can be used in the main basename
function I think that would be my preference as you automatically limit possible inconsistencies between filename
and basename
. (For example, fixes like this would apply to both #180)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current slate of options:
import { group, bench, run } from "mitata";
import { basename } from "pathe";
const paths = Object.keys({
// POSIX
"test.html": "test",
"/temp/myfile.html": "myfile",
"./myfile.html": "myfile",
"/Users/john.doe/foo/myFile.js": "myFile",
"/Users/john.doe/foo/myFile": "myFile",
"./.hidden/myFile.ts": "myFile",
"./.hidden/myFile": "myFile",
"/temp/.gitignore": ".gitignore",
// Windows
"C:\\temp\\": undefined,
"C:\\temp\\myfile.html": "myfile",
"\\temp\\myfile.html": "myfile",
".\\myfile.html": "myfile",
".\\john.doe\\myfile.js": "myfile",
".\\john.doe\\myfile": "myfile",
".\\.hidden\\myfile.js": "myfile",
".\\.hidden\\myfile": "myfile",
"C:\\temp\\.gitignore": ".gitignore",
})
const FILENAME_RE = /(^|[/\\])([^/\\]+?)(?=(\.[^.]+)?$)/;
const regex = function filename(path) {
return path.match(FILENAME_RE)?.[2];
}
const noRegex = function filename(path) {
const base = basename(path);
if (!base) {
return undefined;
}
const separatorIndex = base.lastIndexOf(".");
if (separatorIndex <= 0) {
return base;
}
return base.slice(0, separatorIndex);
}
const SLASH_RE = /[/\\]/
const mixed = function filename(path) {
const base = path.split(SLASH_RE).pop()
if (!base) {
return undefined;
}
const separatorIndex = base.lastIndexOf(".");
if (separatorIndex <= 0) {
return base;
}
return base.slice(0, separatorIndex);
}
const lastIndex = function filename(path) {
let slashIndex = -1;
let len = path.length
for (let i = len - 1; i >= 0; i--) {
if (SLASH_RE.test(path.charAt(i))) {
slashIndex = i;
break;
}
}
const base = slashIndex === -1 ? path : path.slice(slashIndex)
if (!base) {
return undefined;
}
const separatorIndex = base.lastIndexOf(".");
if (separatorIndex <= 0) {
return base;
}
return base.slice(0, separatorIndex);
}
const lastIndexOneLoop = function filename(path) {
let slashIndex = -1;
let dotIndex = -1;
let len = path.length
for (let i = len - 1; i >= 0; i--) {
const char = path.charAt(i);
if (SLASH_RE.test(char)) {
slashIndex = i;
break;
}
if (dotIndex === -1 && char === '.') {
dotIndex = i
}
}
const base = slashIndex === -1 ? path : path.slice(slashIndex)
if (!base) {
return undefined;
}
if (dotIndex <= 0) {
return base;
}
return base.slice(0, dotIndex);
}
// Based on https://github.com/unjs/pathe/pull/180/files
// but using a regex to split the path
const basenameRegex = function (p, extension) {
const segments = p.split(SLASH_RE);
// default to empty string
let lastSegment = "";
for (let i = segments.length - 1; i >= 0; i--) {
const val = segments[i];
if (val) {
lastSegment = val;
break;
}
}
return extension && lastSegment.endsWith(extension)
? lastSegment.slice(0, -extension.length)
: lastSegment;
};
const withAlternativeBasename = function filename(path) {
const base = basenameRegex(path)
if (!base) {
return undefined;
}
const separatorIndex = base.lastIndexOf(".");
if (separatorIndex <= 0) {
return base;
}
return base.slice(0, separatorIndex);
}
paths.forEach((fixture) => {
group(fixture, () => {
Object.entries({
regex,
noRegex,
// basename,
mixed,
lastIndex,
lastIndexOneLoop,
withAlternativeBasename
}).forEach(([name, fn]) => {
bench(name + ": " + fixture, () => {
fn(fixture);
});
});
});
});
run();
50fec47
to
a576974
Compare
(#184)
This change resolves the issue reported with files containing no extension. It also adds tests for filenames such as
.gitignore
or.env
which begin with a period.