Limit symbol_map to only nonzero-sized symbols #447
A snippet from
|
Hmm, should this STT_NOTYPE actually be here? https://github.com/gimli-rs/object/blob/master/src/read/elf/symbol.rs#L476 It looks like the backtrace crate limits itself to STT_FUNC and STT_OBJECT: https://github.com/rust-lang/backtrace-rs/blob/master/src/symbolize/gimli/elf.rs#L113 |
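For reference, a type-based filter of the kind mentioned above could look roughly like this. It is a minimal sketch, not the code from either crate: the constant values and the low-four-bits layout of `st_info` come from the ELF specification, while `st_type` and `is_func_or_object` are hypothetical helper names.

```rust
// ELF symbol type values (stored in the low four bits of `st_info`).
#[allow(dead_code)]
const STT_NOTYPE: u8 = 0;
const STT_OBJECT: u8 = 1;
const STT_FUNC: u8 = 2;

/// Extracts the symbol type from an ELF `st_info` byte.
fn st_type(st_info: u8) -> u8 {
    st_info & 0xf
}

/// Hypothetical filter: keep only function and data object symbols,
/// which excludes STT_NOTYPE (and every other type).
fn is_func_or_object(st_info: u8) -> bool {
    matches!(st_type(st_info), STT_FUNC | STT_OBJECT)
}
```

With a filter like this, an `st_info` of `0x12` (global binding, function type) is kept, while an `STT_NOTYPE` symbol is dropped.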
I've seen |
Alternatively, the logic could try to pick the "best" name for a given address. Beyond the weird
There could be some logic that e.g. prefers |
Examples of
A more dubious one from a vmlinux for aarch64:
My recollection is that these are generated by something other than gcc/llvm (e.g. maybe GNU as).
This sounds desirable, although the way the code is written doesn't make this easy to do.
As an aside, we don't currently include versions in the map, but maybe that would be useful. |
It appears that the version is directly embedded in the symbol name, so I'd expect the map to return the versioned name for roughly half the symbols depending on how the unstable sort behaved. |
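If the version really is embedded in the name, a caller could split it off after the lookup. The sketch below assumes the GNU convention of `name@VERSION` (and `name@@VERSION` for the default version); `split_version` is a hypothetical helper, not part of the object crate.

```rust
/// Splits a symbol name of the form `name@VERSION` or `name@@VERSION`
/// into the base name and an optional version string.
/// Names without an `@` are returned unchanged with no version.
fn split_version(name: &str) -> (&str, Option<&str>) {
    match name.split_once('@') {
        // `rest` may start with a second '@' for default versions.
        Some((base, rest)) => (base, Some(rest.trim_start_matches('@'))),
        None => (name, None),
    }
}
```

This would let the map report the same base name regardless of which of the two entries the sort happened to keep.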
Updated to filter out zero sized symbols instead. |
It looks like all
|
I think that Some of the symbols appear to be incorrectly specified as zero size. For example, So I think it'll be a case of trying the best we can, rather than expecting it to be perfect. How about only doing the zero size check for |
@sfackler Would this work for you? Or do you have another suggestion? |
This is the logic I ended up using, which also incorporates that heuristic to pick the "best" symbol:
    fn symbol_map(file: &File<'a, &'a [u8]>) -> SymbolMap<SymbolMapName<'a>> {
        let mut map = BTreeMap::<u64, PotentialName<'_>>::new();
        let symbols = file
            .symbol_table()
            // https://github.com/gimli-rs/object/pull/443
            .filter(|t| t.symbols().next().is_some())
            .or_else(|| file.dynamic_symbol_table());
        // Binaries commonly have multiple symbols for a given address, so we put a bit of effort in here to pick the
        // "best" one. In particular, we limit to function and data object symbols, and prefer globally visible symbols
        // over private symbols.
        for symbol in symbols.into_iter().flat_map(|s| s.symbols()) {
            if !symbol.is_definition() {
                continue;
            }
            if symbol.kind() == SymbolKind::Unknown {
                continue;
            }
            let name = match symbol.name() {
                Ok(name) => name,
                Err(_) => continue,
            };
            let name = PotentialName {
                name: SymbolMapName::new(symbol.address(), name),
                global: symbol.is_global(),
            };
            match map.entry(symbol.address()) {
                Entry::Occupied(mut e) => {
                    if name.global && !e.get().global {
                        e.insert(name);
                    }
                }
                Entry::Vacant(e) => {
                    e.insert(name);
                }
            }
        }
        SymbolMap::new(map.into_values().map(|n| n.name).collect())
    }
} |
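The snippet above depends on the object crate and on a `PotentialName` type that is not shown. To make the selection rule easier to follow in isolation, here is a standalone sketch of the same idea using only the standard library. The `Sym` and `Kind` types and the `best_names` function are hypothetical stand-ins, not types from the crate.

```rust
use std::collections::btree_map::{BTreeMap, Entry};

/// Simplified stand-in for a symbol kind.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Kind {
    Unknown,
    Text,
    Data,
}

/// Simplified stand-in for a symbol table entry.
#[derive(Clone, Copy, Debug)]
struct Sym<'a> {
    address: u64,
    name: &'a str,
    kind: Kind,
    global: bool,
}

/// Picks one name per address: symbols of unknown kind are skipped, and a
/// global symbol replaces a previously seen non-global one at the same address.
fn best_names<'a>(symbols: &[Sym<'a>]) -> Vec<(u64, &'a str)> {
    let mut map: BTreeMap<u64, Sym<'a>> = BTreeMap::new();
    for &sym in symbols {
        if sym.kind == Kind::Unknown {
            continue;
        }
        match map.entry(sym.address) {
            Entry::Occupied(mut e) => {
                if sym.global && !e.get().global {
                    e.insert(sym);
                }
            }
            Entry::Vacant(e) => {
                e.insert(sym);
            }
        }
    }
    // BTreeMap iterates in ascending address order.
    map.into_iter().map(|(addr, s)| (addr, s.name)).collect()
}
```

Given an unknown-kind marker, a local alias, and a global function all at the same address, only the global function's name survives. Among several candidates with the same visibility, the first one seen wins, so the result still depends on symbol table order in that case.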
The documentation for symbol_map claims that this filtering should be happening, but it doesn't appear that it was. In particular, I'm seeing that code produced by rustc has symbols with the Unknown kind named like $x.224 defined at the same address as many functions, and depending on the ordering can be returned by SymbolMap::get rather than the "real" function name.
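To illustrate why dropping zero-sized symbols helps, here is a small model of an address-to-name map. It is a sketch under stated assumptions, not the object crate's implementation: `MapEntry`, `build_map`, and `lookup` are hypothetical names, the map is assumed to be a sorted vector searched by address, and the `$x.224` marker is assumed to have size zero.

```rust
/// Simplified stand-in for a symbol map entry.
#[derive(Clone, Copy, Debug)]
struct MapEntry<'a> {
    address: u64,
    size: u64,
    name: &'a str,
}

/// Builds a lookup table: zero-sized symbols are dropped, and the rest are
/// sorted by address.
fn build_map<'a>(mut symbols: Vec<MapEntry<'a>>) -> Vec<MapEntry<'a>> {
    symbols.retain(|s| s.size != 0);
    symbols.sort_by_key(|s| s.address);
    symbols
}

/// Returns the name of the symbol whose [address, address + size) range
/// contains `address`, if any.
fn lookup<'a>(map: &[MapEntry<'a>], address: u64) -> Option<&'a str> {
    // Index of the first entry that starts after `address`.
    let idx = map.partition_point(|s| s.address <= address);
    // The candidate is the last entry starting at or before `address`.
    let entry = map[..idx].last()?;
    if address - entry.address < entry.size {
        Some(entry.name)
    } else {
        None
    }
}
```

Without the `retain` call, the marker and the function would share an address, and which one a lookup returns would depend on how the sort ordered them.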