[Assignment Tracking][5/*] Add core infrastructure for instruction reference

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Overview
It's possible to find intrinsics linked to an instruction by looking at the
MetadataAsValue uses of the attached DIAssignID. That covers instruction ->
intrinsic(s) lookup. Add a global DIAssignID -> instruction(s) map which gives
us the ability to perform intrinsic -> instruction(s) lookup. Add plumbing to
keep the map up to date through optimisations and add utility functions
including two that perform those lookups. Finally, add a unit test.

Details
In llvm/lib/IR/LLVMContextImpl.h add AssignmentIDToInstrs which maps DIAssignID
* attachments to Instruction *s. Because the DIAssignID * is the key we can't
use a TrackingMDNodeRef for it, and therefore cannot easily update the mapping
when a temporary DIAssignID is replaced.

Temporary DIAssignIDs are only used in IR parsing to deal with metadata
forward references. Update llvm/lib/AsmParser/LLParser.cpp to avoid using
temporary DIAssignIDs for attachments.
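
As a rough illustration of that deferral (not part of this patch), the idea
can be modelled with standard containers. Node, Inst and Parser below are
hypothetical stand-ins rather than LLVM types, and the pending table is keyed
by metadata number for simplicity, whereas the patch keys its table by the
placeholder node pointer:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-ins for a metadata node and an instruction.
struct Node {};
struct Inst {
  const Node *AssignID = nullptr; // Attached ID, if any.
};

struct Parser {
  // Metadata number -> instructions still waiting for the real node.
  std::map<unsigned, std::vector<Inst *>> Pending;

  // An instruction referenced !N as its ID. Record it; attach nothing yet.
  void noteAttachment(unsigned MetadataNo, Inst &I) {
    Pending[MetadataNo].push_back(&I);
  }

  // "!N = ..." has been parsed: attach the real node to every waiting
  // instruction and return how many were updated.
  std::size_t define(unsigned MetadataNo, const Node &Real) {
    auto It = Pending.find(MetadataNo);
    if (It == Pending.end())
      return 0;
    std::size_t Count = It->second.size();
    for (Inst *I : It->second) {
      assert(!I->AssignID && "unexpected existing attachment");
      I->AssignID = &Real;
    }
    Pending.erase(It);
    return Count;
  }
};
```

Because no placeholder is ever attached in this model, nothing that tracks
attachments has to cope with a placeholder being swapped out later.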

In llvm/lib/IR/Metadata.cpp add Instruction::updateDIAssignIDMapping which is
called to remove or add an entry (or both) to AssignmentIDToInstrs. Call this
from Instruction::setMetadata and add a call to setMetadata in Instruction's
dtor that explicitly unsets the DIAssignID so that the mapping gets updated.
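
The bookkeeping described above can be sketched in isolation as follows.
This is a simplified model under stated assumptions, not the code in this
patch: AssignID, Inst, IDToInstsMap and countInsts are hypothetical names, and
std::unordered_map stands in for the DenseMap held by the context.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; not LLVM types.
struct AssignID {};
struct Inst;
using IDToInstsMap = std::unordered_map<const AssignID *, std::vector<Inst *>>;

struct Inst {
  IDToInstsMap &Map;            // Plays the role of AssignmentIDToInstrs.
  const AssignID *ID = nullptr; // Current attachment, if any.

  explicit Inst(IDToInstsMap &M) : Map(M) {}
  Inst(const Inst &) = delete;
  Inst &operator=(const Inst &) = delete;
  // Unset the ID on destruction so no stale pointer stays in the map.
  ~Inst() { setID(nullptr); }

  void setID(const AssignID *NewID) {
    if (NewID == ID)
      return; // Nothing to do if the ID isn't changing.
    if (ID) {
      // Unmap this instruction from its current ID.
      auto It = Map.find(ID);
      assert(It != Map.end() && "existing attachment should be mapped");
      auto &Vec = It->second;
      auto Pos = std::find(Vec.begin(), Vec.end(), this);
      assert(Pos != Vec.end() && "instruction should be in the vector");
      if (Vec.size() == 1)
        Map.erase(It); // Last instruction for this ID: drop the entry.
      else
        Vec.erase(Pos);
    }
    // Map this instruction to the new ID.
    if (NewID)
      Map[NewID].push_back(this);
    ID = NewID;
  }
};

// Number of instructions currently mapped to ID.
inline std::size_t countInsts(const IDToInstsMap &Map, const AssignID *ID) {
  auto It = Map.find(ID);
  return It == Map.end() ? 0 : It->second.size();
}
```

Erasing the whole entry when the last instruction goes away keeps the map
free of empty vectors, so a failed lookup reliably means "no instruction has
this ID".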

In llvm/lib/IR/DebugInfo.cpp and DebugInfo.h add utility functions:

    getAssignmentInsts(const DbgAssignIntrinsic *DAI)
    getAssignmentMarkers(const Instruction *Inst)
    RAUW(DIAssignID *Old, DIAssignID *New)
    deleteAll(Function *F)

These core utils are tested in llvm/unittests/IR/DebugInfoTest.cpp.
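
One detail worth illustrating is why RAUW copies the instruction range before
re-attaching: changing an attachment edits the very vector being walked. The
sketch below models that pattern with standard containers only. AssignID,
Inst, attach and replaceAllAttachments are hypothetical names, not the LLVM
API:

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; not LLVM types.
struct AssignID {};
struct Inst {
  const AssignID *ID = nullptr;
};
using IDToInstsMap = std::unordered_map<const AssignID *, std::vector<Inst *>>;

// Change I's attachment and keep the ID -> instructions map in sync.
void attach(IDToInstsMap &Map, Inst &I, const AssignID *NewID) {
  if (I.ID == NewID)
    return;
  if (I.ID) {
    auto It = Map.find(I.ID);
    if (It != Map.end()) {
      auto &Vec = It->second;
      Vec.erase(std::remove(Vec.begin(), Vec.end(), &I), Vec.end());
      if (Vec.empty())
        Map.erase(It);
    }
  }
  if (NewID)
    Map[NewID].push_back(&I);
  I.ID = NewID;
}

// Re-attach every instruction that currently has Old so that it has New.
void replaceAllAttachments(IDToInstsMap &Map, const AssignID *Old,
                           const AssignID *New) {
  auto It = Map.find(Old);
  if (It == Map.end())
    return;
  // Copy first: attach() shrinks (and finally erases) the vector that
  // It->second refers to, which would invalidate iterators into it.
  std::vector<Inst *> Snapshot(It->second.begin(), It->second.end());
  for (Inst *I : Snapshot)
    attach(Map, *I, New);
}
```

Iterating the snapshot costs one small copy but makes the loop independent
of how the map is mutated underneath it.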

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D132224
OCHyams 2022-11-07 11:56:36 +00:00
parent 2bf960aef0
commit 171f7024cc
11 changed files with 342 additions and 3 deletions

View File

@ -108,6 +108,12 @@ namespace llvm {
SmallVector<Instruction*, 64> InstsWithTBAATag;
/// DIAssignID metadata does not support temporary RAUW so we cannot use
/// the normal metadata forward reference resolution method. Instead,
/// non-temporary DIAssignID are attached to instructions (recorded here)
/// then replaced later.
DenseMap<MDNode *, SmallVector<Instruction *, 2>> TempDIAssignIDAttachments;
// Type resolution handling data structures. The location is set when we
// have processed a use of the type but not a definition yet.
StringMap<std::pair<Type*, LocTy> > NamedTypes;

View File

@ -21,7 +21,7 @@
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/TinyPtrVector.h"
#include "llvm/ADT/iterator_range.h"
#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/IntrinsicInst.h"
namespace llvm {
@ -159,6 +159,67 @@ private:
SmallPtrSet<const MDNode *, 32> NodesSeen;
};
/// Assignment Tracking (at).
namespace at {
//
// Utilities for enumerating storing instructions from an assignment ID.
//
/// A range of instructions.
using AssignmentInstRange =
iterator_range<SmallVectorImpl<Instruction *>::iterator>;
/// Return a range of instructions (typically just one) that have \p ID
/// as an attachment.
/// Iterators invalidated by adding or removing DIAssignID metadata to/from any
/// instruction (including by deleting or cloning instructions).
AssignmentInstRange getAssignmentInsts(DIAssignID *ID);
/// Return a range of instructions (typically just one) that perform the
/// assignment that \p DAI encodes.
/// Iterators invalidated by adding or removing DIAssignID metadata to/from any
/// instruction (including by deleting or cloning instructions).
inline AssignmentInstRange getAssignmentInsts(const DbgAssignIntrinsic *DAI) {
return getAssignmentInsts(cast<DIAssignID>(DAI->getAssignID()));
}
//
// Utilities for enumerating llvm.dbg.assign intrinsic from an assignment ID.
//
/// High level: this is an iterator for llvm.dbg.assign intrinsics.
/// Implementation details: this is a wrapper around Value's User iterator that
/// dereferences to a DbgAssignIntrinsic ptr rather than a User ptr.
class DbgAssignIt
: public iterator_adaptor_base<DbgAssignIt, Value::user_iterator,
typename std::iterator_traits<
Value::user_iterator>::iterator_category,
DbgAssignIntrinsic *, std::ptrdiff_t,
DbgAssignIntrinsic **,
DbgAssignIntrinsic *&> {
public:
DbgAssignIt(Value::user_iterator It) : iterator_adaptor_base(It) {}
DbgAssignIntrinsic *operator*() const { return cast<DbgAssignIntrinsic>(*I); }
};
/// A range of llvm.dbg.assign intrinsics.
using AssignmentMarkerRange = iterator_range<DbgAssignIt>;
/// Return a range of dbg.assign intrinsics which use \p ID as an operand.
/// Iterators invalidated by deleting an intrinsic contained in this range.
AssignmentMarkerRange getAssignmentMarkers(DIAssignID *ID);
/// Return a range of dbg.assign intrinsics for which \p Inst performs the
/// assignment they encode.
/// Iterators invalidated by deleting an intrinsic contained in this range.
inline AssignmentMarkerRange getAssignmentMarkers(const Instruction *Inst) {
if (auto *ID = Inst->getMetadata(LLVMContext::MD_DIAssignID))
return getAssignmentMarkers(cast<DIAssignID>(ID));
else
return make_range(Value::user_iterator(), Value::user_iterator());
}
/// Replace all uses (and attachments) of \p Old with \p New.
void RAUW(DIAssignID *Old, DIAssignID *New);
/// Remove all Assignment Tracking related intrinsics and metadata from \p F.
void deleteAll(Function *F);
} // end namespace at
/// Return true if assignment tracking is enabled.
bool getEnableAssignmentTracking();
} // end namespace llvm

View File

@ -515,6 +515,10 @@ private:
void
getAllMetadataImpl(SmallVectorImpl<std::pair<unsigned, MDNode *>> &) const;
/// Update the LLVMContext ID-to-Instruction(s) mapping. If \p ID is nullptr
/// then clear the mapping for this instruction.
void updateDIAssignIDMapping(DIAssignID *ID);
public:
//===--------------------------------------------------------------------===//
// Predicates and helper methods.

View File

@ -853,7 +853,18 @@ bool LLParser::parseStandaloneMetadata() {
// See if this was forward referenced, if so, handle it.
auto FI = ForwardRefMDNodes.find(MetadataID);
if (FI != ForwardRefMDNodes.end()) {
FI->second.first->replaceAllUsesWith(Init);
auto *ToReplace = FI->second.first.get();
// DIAssignID has its own special forward-reference "replacement" for
// attachments (the temporary attachments are never actually attached).
if (isa<DIAssignID>(Init)) {
for (auto *Inst : TempDIAssignIDAttachments[ToReplace]) {
assert(!Inst->getMetadata(LLVMContext::MD_DIAssignID) &&
"Inst unexpectedly already has DIAssignID attachment");
Inst->setMetadata(LLVMContext::MD_DIAssignID, Init);
}
}
ToReplace->replaceAllUsesWith(Init);
ForwardRefMDNodes.erase(FI);
assert(NumberedMetadata[MetadataID] == Init && "Tracking VH didn't work");
@ -2082,7 +2093,11 @@ bool LLParser::parseInstructionMetadata(Instruction &Inst) {
if (parseMetadataAttachment(MDK, N))
return true;
Inst.setMetadata(MDK, N);
if (MDK == LLVMContext::MD_DIAssignID)
TempDIAssignIDAttachments[N].push_back(&Inst);
else
Inst.setMetadata(MDK, N);
if (MDK == LLVMContext::MD_tbaa)
InstsWithTBAATag.push_back(&Inst);

View File

@ -12,6 +12,7 @@
//===----------------------------------------------------------------------===//
#include "llvm-c/DebugInfo.h"
#include "LLVMContextImpl.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/STLExtras.h"
@ -37,6 +38,7 @@
#include <utility>
using namespace llvm;
using namespace llvm::at;
using namespace llvm::dwarf;
static cl::opt<bool>
@ -1632,3 +1634,61 @@ LLVMMetadataKind LLVMGetMetadataKind(LLVMMetadataRef Metadata) {
return (LLVMMetadataKind)LLVMGenericDINodeMetadataKind;
}
}
AssignmentInstRange at::getAssignmentInsts(DIAssignID *ID) {
assert(ID && "Expected non-null ID");
LLVMContext &Ctx = ID->getContext();
auto &Map = Ctx.pImpl->AssignmentIDToInstrs;
auto MapIt = Map.find(ID);
if (MapIt == Map.end())
return make_range(nullptr, nullptr);
return make_range(MapIt->second.begin(), MapIt->second.end());
}
AssignmentMarkerRange at::getAssignmentMarkers(DIAssignID *ID) {
assert(ID && "Expected non-null ID");
LLVMContext &Ctx = ID->getContext();
auto *IDAsValue = MetadataAsValue::getIfExists(Ctx, ID);
// The ID is only used wrapped in MetadataAsValue(ID), so let's check that
// one of those already exists first.
if (!IDAsValue)
return make_range(Value::user_iterator(), Value::user_iterator());
return make_range(IDAsValue->user_begin(), IDAsValue->user_end());
}
void at::RAUW(DIAssignID *Old, DIAssignID *New) {
// Replace MetadataAsValue uses.
if (auto *OldIDAsValue =
MetadataAsValue::getIfExists(Old->getContext(), Old)) {
auto *NewIDAsValue = MetadataAsValue::get(Old->getContext(), New);
OldIDAsValue->replaceAllUsesWith(NewIDAsValue);
}
// Replace attachments.
AssignmentInstRange InstRange = getAssignmentInsts(Old);
// Use intermediate storage for the instruction ptrs because the
// getAssignmentInsts range iterators will be invalidated by adding and
// removing DIAssignID attachments.
SmallVector<Instruction *> InstVec(InstRange.begin(), InstRange.end());
for (auto *I : InstVec)
I->setMetadata(LLVMContext::MD_DIAssignID, New);
}
void at::deleteAll(Function *F) {
SmallVector<DbgAssignIntrinsic *, 12> ToDelete;
for (BasicBlock &BB : *F) {
for (Instruction &I : BB) {
if (auto *DAI = dyn_cast<DbgAssignIntrinsic>(&I))
ToDelete.push_back(DAI);
else
I.setMetadata(LLVMContext::MD_DIAssignID, nullptr);
}
}
for (auto *DAI : ToDelete)
DAI->eraseFromParent();
}

View File

@ -55,6 +55,10 @@ Instruction::~Instruction() {
// instructions in a BasicBlock are deleted).
if (isUsedByMetadata())
ValueAsMetadata::handleRAUW(this, UndefValue::get(getType()));
// Explicitly remove DIAssignID metadata to clear up ID -> Instruction(s)
// mapping in LLVMContext.
setMetadata(LLVMContext::MD_DIAssignID, nullptr);
}

View File

@ -1499,6 +1499,11 @@ public:
/// Collection of metadata used in this context.
DenseMap<const Value *, MDAttachments> ValueMetadata;
/// Map DIAssignID -> Instructions with that attachment.
/// Managed by Instruction via Instruction::updateDIAssignIDMapping.
/// Query using the at:: functions defined in DebugInfo.h.
DenseMap<DIAssignID *, SmallVector<Instruction *, 1>> AssignmentIDToInstrs;
/// Collection of per-GlobalObject sections used in this context.
DenseMap<const GlobalObject *, StringRef> GlobalObjectSections;

View File

@ -1425,6 +1425,37 @@ void Instruction::dropUnknownNonDebugMetadata(ArrayRef<unsigned> KnownIDs) {
}
}
void Instruction::updateDIAssignIDMapping(DIAssignID *ID) {
auto &IDToInstrs = getContext().pImpl->AssignmentIDToInstrs;
if (const DIAssignID *CurrentID =
cast_or_null<DIAssignID>(getMetadata(LLVMContext::MD_DIAssignID))) {
// Nothing to do if the ID isn't changing.
if (ID == CurrentID)
return;
// Unmap this instruction from its current ID.
auto InstrsIt = IDToInstrs.find(CurrentID);
assert(InstrsIt != IDToInstrs.end() &&
"Expect existing attachment to be mapped");
auto &InstVec = InstrsIt->second;
auto *InstIt = std::find(InstVec.begin(), InstVec.end(), this);
assert(InstIt != InstVec.end() &&
"Expect instruction to be mapped to attachment");
// The vector contains a ptr to this. If this is the only element in the
// vector, remove the ID:vector entry, otherwise just remove the
// instruction from the vector.
if (InstVec.size() == 1)
IDToInstrs.erase(InstrsIt);
else
InstVec.erase(InstIt);
}
// Map this instruction to the new ID.
if (ID)
IDToInstrs[ID].push_back(this);
}
void Instruction::setMetadata(unsigned KindID, MDNode *Node) {
if (!Node && !hasMetadata())
return;
@ -1435,6 +1466,16 @@ void Instruction::setMetadata(unsigned KindID, MDNode *Node) {
return;
}
// Update DIAssignID to Instruction(s) mapping.
if (KindID == LLVMContext::MD_DIAssignID) {
// The DIAssignID tracking infrastructure doesn't support RAUWing temporary
// nodes with DIAssignIDs. The cast_or_null below would also catch this, but
// having a dedicated assert helps make this obvious.
assert((!Node || !Node->isTemporary()) &&
"Temporary DIAssignIDs are invalid");
updateDIAssignIDMapping(cast_or_null<DIAssignID>(Node));
}
Value::setMetadata(KindID, Node);
}

View File

@ -68,6 +68,7 @@
#include "llvm/IR/ConstantRange.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DebugInfo.h"
#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/DebugLoc.h"
#include "llvm/IR/DerivedTypes.h"
@ -4548,6 +4549,10 @@ void Verifier::visitDIAssignIDMetadata(Instruction &I, MDNode *MD) {
CheckDI(isa<DbgAssignIntrinsic>(User),
"!DIAssignID should only be used by llvm.dbg.assign intrinsics",
MD, User);
// All of the dbg.assign intrinsics should be in the same function as I.
if (auto *DAI = dyn_cast<DbgAssignIntrinsic>(User))
CheckDI(DAI->getFunction() == I.getFunction(),
"dbg.assign not in same function as inst", DAI, &I);
}
}
}
@ -6008,6 +6013,10 @@ void Verifier::visitDbgIntrinsic(StringRef Kind, DbgVariableIntrinsic &DII) {
CheckDI(isa<DIExpression>(DAI->getRawAddressExpression()),
"invalid llvm.dbg.assign intrinsic address expression", &DII,
DAI->getRawAddressExpression());
// All of the linked instructions should be in the same function as DII.
for (Instruction *I : at::getAssignmentInsts(DAI))
CheckDI(DAI->getFunction() == I->getFunction(),
"inst not in same function as dbg.assign", I, DAI);
}
// Ignore broken !dbg attachments; they're checked elsewhere.

View File

@ -6,6 +6,13 @@
;;
;; Checks for this one are inline.
define dso_local void @fun2() !dbg !15 {
;; DIAssignID copied here from @fun() where it is used by intrinsics.
; CHECK: dbg.assign not in same function as inst
%x = alloca i32, align 4, !DIAssignID !14
ret void
}
define dso_local void @fun() !dbg !7 {
entry:
%a = alloca i32, align 4, !DIAssignID !14
@ -50,3 +57,4 @@ declare void @llvm.dbg.assign(metadata, metadata, metadata, metadata, metadata,
!11 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!13 = !DILocation(line: 1, column: 1, scope: !7)
!14 = distinct !DIAssignID()
!15 = distinct !DISubprogram(name: "fun2", scope: !1, file: !1, line: 1, type: !8, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !2)

View File

@ -368,4 +368,130 @@ TEST(DIBuilder, createDbgAddr) {
EXPECT_EQ(MDExp->getNumElements(), 0u);
}
TEST(AssignmentTrackingTest, Utils) {
// Test the assignment tracking utils defined in DebugInfo.h namespace at {}.
// This includes:
// getAssignmentInsts
// getAssignmentMarkers
// RAUW
// deleteAll
//
// The input IR includes two functions, fun1 and fun2. Both contain an alloca
// with a DIAssignID tag. fun1's alloca is linked to two llvm.dbg.assign
// intrinsics, one of which is for an inlined variable and appears before the
// alloca.
LLVMContext C;
std::unique_ptr<Module> M = parseIR(C, R"(
define dso_local void @fun1() !dbg !7 {
entry:
call void @llvm.dbg.assign(metadata i32 undef, metadata !10, metadata !DIExpression(), metadata !12, metadata i32 undef, metadata !DIExpression()), !dbg !13
%local = alloca i32, align 4, !DIAssignID !12
call void @llvm.dbg.assign(metadata i32 undef, metadata !16, metadata !DIExpression(), metadata !12, metadata i32 undef, metadata !DIExpression()), !dbg !15
ret void, !dbg !15
}
define dso_local void @fun2() !dbg !17 {
entry:
%local = alloca i32, align 4, !DIAssignID !20
call void @llvm.dbg.assign(metadata i32 undef, metadata !18, metadata !DIExpression(), metadata !20, metadata i32 undef, metadata !DIExpression()), !dbg !19
ret void, !dbg !19
}
declare void @llvm.dbg.assign(metadata, metadata, metadata, metadata, metadata, metadata)
!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3, !4, !5}
!llvm.ident = !{!6}
!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 14.0.0", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, splitDebugInlining: false, nameTableKind: None)
!1 = !DIFile(filename: "test.c", directory: "/")
!2 = !{}
!3 = !{i32 7, !"Dwarf Version", i32 4}
!4 = !{i32 2, !"Debug Info Version", i32 3}
!5 = !{i32 1, !"wchar_size", i32 4}
!6 = !{!"clang version 14.0.0"}
!7 = distinct !DISubprogram(name: "fun1", scope: !1, file: !1, line: 1, type: !8, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !2)
!8 = !DISubroutineType(types: !9)
!9 = !{null}
!10 = !DILocalVariable(name: "local3", scope: !14, file: !1, line: 2, type: !11)
!11 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!12 = distinct !DIAssignID()
!13 = !DILocation(line: 5, column: 1, scope: !14, inlinedAt: !15)
!14 = distinct !DISubprogram(name: "inline", scope: !1, file: !1, line: 1, type: !8, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !2)
!15 = !DILocation(line: 3, column: 1, scope: !7)
!16 = !DILocalVariable(name: "local1", scope: !7, file: !1, line: 2, type: !11)
!17 = distinct !DISubprogram(name: "fun2", scope: !1, file: !1, line: 1, type: !8, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !2)
!18 = !DILocalVariable(name: "local2", scope: !17, file: !1, line: 2, type: !11)
!19 = !DILocation(line: 4, column: 1, scope: !17)
!20 = distinct !DIAssignID()
)");
// Check the test IR isn't malformed.
ASSERT_TRUE(M);
Function &Fun1 = *M->getFunction("fun1");
Instruction &Alloca = *Fun1.getEntryBlock().getFirstNonPHIOrDbg();
// 1. Check the Instruction <-> Intrinsic mappings work in fun1.
//
// Check there are two llvm.dbg.assign intrinsics linked to Alloca.
auto CheckFun1Mapping = [&Alloca]() {
auto Markers = at::getAssignmentMarkers(&Alloca);
EXPECT_TRUE(std::distance(Markers.begin(), Markers.end()) == 2);
// Check those two entries are distinct.
DbgAssignIntrinsic *First = *Markers.begin();
DbgAssignIntrinsic *Second = *std::next(Markers.begin());
EXPECT_NE(First, Second);
// Check that we can get back to Alloca from each llvm.dbg.assign.
for (auto *DAI : Markers) {
auto Insts = at::getAssignmentInsts(DAI);
// Check there is exactly one instruction linked to each intrinsic. Use
// ASSERT_TRUE because we're going to dereference the begin iterator.
ASSERT_TRUE(std::distance(Insts.begin(), Insts.end()) == 1);
EXPECT_FALSE(Insts.empty());
// Check the linked instruction is Alloca.
Instruction *LinkedInst = *Insts.begin();
EXPECT_EQ(LinkedInst, &Alloca);
}
};
CheckFun1Mapping();
// 2. Check DIAssignID RAUW replaces attachments and uses.
//
DIAssignID *Old =
cast_or_null<DIAssignID>(Alloca.getMetadata(LLVMContext::MD_DIAssignID));
DIAssignID *New = DIAssignID::getDistinct(C);
ASSERT_TRUE(Old && New && New != Old);
at::RAUW(Old, New);
// Check fun1's alloca and intrinsics have been updated and the mapping still
// works.
EXPECT_EQ(New, cast_or_null<DIAssignID>(
Alloca.getMetadata(LLVMContext::MD_DIAssignID)));
CheckFun1Mapping();
// Check that fun2's alloca and intrinsic have not been updated.
Instruction &Fun2Alloca =
*M->getFunction("fun2")->getEntryBlock().getFirstNonPHIOrDbg();
DIAssignID *Fun2ID = cast_or_null<DIAssignID>(
Fun2Alloca.getMetadata(LLVMContext::MD_DIAssignID));
EXPECT_NE(New, Fun2ID);
auto Fun2Markers = at::getAssignmentMarkers(&Fun2Alloca);
ASSERT_TRUE(std::distance(Fun2Markers.begin(), Fun2Markers.end()) == 1);
auto Fun2Insts = at::getAssignmentInsts(*Fun2Markers.begin());
ASSERT_TRUE(std::distance(Fun2Insts.begin(), Fun2Insts.end()) == 1);
EXPECT_EQ(*Fun2Insts.begin(), &Fun2Alloca);
// 3. Check that deleting works and applies only to the target function.
at::deleteAll(&Fun1);
// There should now only be the alloca and ret in fun1.
EXPECT_EQ(Fun1.begin()->size(), 2);
// fun2's alloca should have the same DIAssignID and remain linked to its
// llvm.dbg.assign.
EXPECT_EQ(Fun2ID, cast_or_null<DIAssignID>(
Fun2Alloca.getMetadata(LLVMContext::MD_DIAssignID)));
EXPECT_FALSE(at::getAssignmentMarkers(&Fun2Alloca).empty());
}
} // end namespace