ROSE  0.11.145.0
Working with attributes

Attaching user-defined attributes to objects.

Many ROSE classes allow users to define and store their own data in the form of attributes. An attribute is a name/value pair where the name uniquely identifies the attribute within the container object and the value has a user-defined type.

ROSE supports three interfaces for attributes: IR node attributes, AstAttributeMechanism, and Sawyer::Attribute.

Although there are three interfaces, they really all share the same basic mechanism. SgNode attributes are implemented in terms of the AstAttributeMechanism, which is implemented in terms of Sawyer::Attribute.

Comparison of attribute interfaces

Each group below compares one property across the three interfaces.

IR node attributes: Applies only to IR nodes.
AstAttributeMechanism: Class authors can add attribute-storing capability to any class by containing an AstAttributeMechanism object.
Sawyer::Attribute: Class authors can add attribute-storing capability to any class by inheriting this interface.

IR node attributes: Can store multiple attributes with many different value types as long as those types all derive from AstAttribute.
AstAttributeMechanism: Can store multiple attributes with many different value types as long as those types all derive from AstAttribute.
Sawyer::Attribute: Can store multiple attributes with many different value types.

IR node attributes: Requires non-class values to be wrapped in a class derived from AstAttribute.
AstAttributeMechanism: Requires non-class values to be wrapped in a class derived from AstAttribute.
Sawyer::Attribute: Can directly store non-class values.

IR node attributes: The user must be able to modify the value type so that it inherits from AstAttribute, or must wrap the type in their own subclass of AstAttribute, adding an extra level of indirection to access the value.
AstAttributeMechanism: The user must be able to modify the value type so that it inherits from AstAttribute, or must wrap the type in their own subclass of AstAttribute, adding an extra level of indirection to access the value.
Sawyer::Attribute: Can store values whose type is not user-modifiable, such as STL containers.

IR node attributes: No assurance that the same name is not used for two different purposes.
AstAttributeMechanism: No assurance that the same name is not used for two different purposes.
Sawyer::Attribute: Ensures that two users don't declare the same attribute name.

IR node attributes: Requires implementation of a virtual copy method (non-pure) if copying is intended.
AstAttributeMechanism: Requires implementation of a virtual copy method (non-pure) if copying is intended.
Sawyer::Attribute: Uses normal C++ copy constructors and assignment operators for attribute values.

IR node attributes: Errors are not reported.
AstAttributeMechanism: Errors are reported by return values.
Sawyer::Attribute: Errors are reported by dedicated exception types.

IR node attributes: Attempting to retrieve a non-existing attribute without providing a default value returns a null attribute pointer.
AstAttributeMechanism: Attempting to retrieve a non-existing attribute without providing a default value returns a null attribute pointer.
Sawyer::Attribute: Attempting to retrieve a non-existing attribute without providing a default value throws a Sawyer::Attribute::DoesNotExist exception.

IR node attributes: Attribute value types are runtime checked. A mismatch is discovered by the user when they perform a dynamic_cast from the AstAttribute base type to their subclass.
AstAttributeMechanism: Attribute value types are runtime checked. A mismatch is discovered by the user when they perform a dynamic_cast from the AstAttribute base type to their subclass.
Sawyer::Attribute: Attribute value types are runtime checked. A mismatch between writing and reading is reported by a Sawyer::Attribute::WrongQueryType exception.

IR node attributes: Requires the user to use a C++ dynamic_cast from the AstAttribute pointer to the user's subclass pointer.
AstAttributeMechanism: Requires the user to use a C++ dynamic_cast from the AstAttribute pointer to the user's subclass pointer.
Sawyer::Attribute: All casting is hidden behind the API.

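Some examples may help illuminate the differences. The examples show three methods of using attributes: Method 1 uses Sawyer::Attribute, Method 2 uses AstAttributeMechanism, and Method 3 uses attributes in IR nodes.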

Let us assume that two types exist in some library header file somewhere and the user wants to store these as attribute values in some object. The two value types are:

// Declared in a 3rd-party library
enum Approximation { UNDER_APPROXIMATED, OVER_APPROXIMATED, UNKNOWN_APPROXIMATION };

// Declared in a 3rd-party library
struct AnalysisTime {
    double cpuTime;
    double elapsedTime;

    AnalysisTime()
        : cpuTime(0.0), elapsedTime(0.0) {}
    AnalysisTime(double cpuTime, double elapsedTime)
        : cpuTime(cpuTime), elapsedTime(elapsedTime) {}
};

Let us also assume that a ROSE developer has a class and wants the user to be able to store attributes in objects of that class. The first step is for the ROSE developer to prepare the class for storing attributes:

// Method 1: Sawyer::Attribute
class ObjectWithAttributes_1: public Sawyer::Attribute::Storage<> {
    // other members here...
};

// Method 2: AstAttributeMechanism
class ObjectWithAttributes_2 {
public:
    AstAttributeMechanism attributeMechanism;
    // other members here...
};

// Method 3: Attributes in IR nodes (class derivation is not demoed here)
typedef SgAsmInstruction ObjectWithAttributes_3;

Method 1 is designed to use inheritance: all of its methods have the word "attribute" in their names. Method 2 could be used by inheritance, but is more commonly used with containment due to its short, common method names like size. Method 3 applies only to Sage IR nodes, but creating a new subclass of SgNode is outside the scope of this document; instead, we'll just use an existing IR node type.

Now we jump into the user code. The user wants to be able to store two attributes, one of each value type. As mentioned above, the attribute value types are defined in some library header, and the class of objects in which to store them is defined in a ROSE header file. Method 1 can store values of any type, but the user has more work to do before using methods 2 or 3:

// Method 2: AstAttributeMechanism needs wrappers with "copy" methods.
class ApproximationAttribute_2: public AstAttribute,
                                public AllocationCounter<ApproximationAttribute_2> // ignore this, it's only for testing the implementation
{
public:
    Approximation approximation;

    explicit ApproximationAttribute_2(Approximation a)
        : approximation(a) {}

    virtual AstAttribute* copy() const override {
        return new ApproximationAttribute_2(*this);
    }

    virtual std::string attribute_class_name() const override {
        return "ApproximationAttribute_2";
    }
};

class AnalysisTimeAttribute_2: public AstAttribute,
                               public AllocationCounter<AnalysisTimeAttribute_2> // ignore this, it's only for testing the implementation
{
public:
    AnalysisTime analysisTime;

    explicit AnalysisTimeAttribute_2(const AnalysisTime &t)
        : analysisTime(t) {}

    virtual AstAttribute* copy() const override {
        return new AnalysisTimeAttribute_2(*this);
    }

    virtual std::string attribute_class_name() const override {
        return "AnalysisTimeAttribute_2";
    }
};

// Method 3: IR node attributes need to be wrapped
class ApproximationAttribute_3: public AstAttribute,
                                public AllocationCounter<ApproximationAttribute_3> // ignore this, it's only for testing the implementation
{
public:
    Approximation approximation;

    explicit ApproximationAttribute_3(Approximation a)
        : approximation(a) {}

    virtual AstAttribute* copy() const override {
        return new ApproximationAttribute_3(*this);
    }

    virtual std::string attribute_class_name() const override {
        return "ApproximationAttribute_3";
    }
};

class AnalysisTimeAttribute_3: public AstAttribute,
                               public AllocationCounter<AnalysisTimeAttribute_3> // ignore this, it's only for testing the implementation
{
public:
    AnalysisTime analysisTime;

    explicit AnalysisTimeAttribute_3(const AnalysisTime &t)
        : analysisTime(t) {}

    virtual AstAttribute* copy() const override {
        return new AnalysisTimeAttribute_3(*this);
    }

    // The returned name must match the class name.
    virtual std::string attribute_class_name() const override {
        return "AnalysisTimeAttribute_3";
    }
};

Method 1 requires no additional wrapper code since it can store any value directly. Methods 2 and 3 both require a substantial amount of boilerplate to store even a simple enum value. The purpose of the copy method is to allocate a new copy of an attribute when the object holding the attribute is copied or assigned. Every AstAttribute subclass should implement copy, although few do. If it is not implemented, one of two things happens: either the attribute is not copied, or only a superclass of the attribute is copied. Subclasses must also implement attribute_class_name, although few do. Neither copy nor attribute_class_name is pure virtual because of limitations with ROSETTA code generation.
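The "only a superclass of the attribute is copied" outcome can be illustrated without ROSE. The following is a minimal, self-contained sketch: BaseAttribute, GoodAttribute, ForgetfulAttribute, copyPreservesType, and demonstrateCopy are hypothetical names invented for this illustration, not ROSE classes, and the base class here deliberately returns a base-only copy to model that outcome.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an attribute base class whose copy is virtual but not pure.
struct BaseAttribute {
    virtual ~BaseAttribute() {}
    // Default implementation copies only the base part of the object.
    virtual BaseAttribute* copy() const { return new BaseAttribute(*this); }
};

// Overrides copy, so the full dynamic type survives duplication.
struct GoodAttribute: BaseAttribute {
    int value;
    explicit GoodAttribute(int v): value(v) {}
    BaseAttribute* copy() const override { return new GoodAttribute(*this); }
};

// Forgets to override copy, so duplicates are reduced to BaseAttribute.
struct ForgetfulAttribute: BaseAttribute {
    int value;
    explicit ForgetfulAttribute(int v): value(v) {}
};

// Returns true if copying the attribute yields an object of dynamic type T.
template<class T>
bool copyPreservesType(const BaseAttribute &original) {
    std::unique_ptr<BaseAttribute> duplicate(original.copy());
    return dynamic_cast<T*>(duplicate.get()) != nullptr;
}

// Example use: only the class that overrides copy is duplicated intact.
inline void demonstrateCopy() {
    assert(copyPreservesType<GoodAttribute>(GoodAttribute(1)));
    assert(!copyPreservesType<ForgetfulAttribute>(ForgetfulAttribute(2)));
}
```

In this sketch a later dynamic_cast to ForgetfulAttribute on the duplicate fails, which is how a user of Methods 2 or 3 would typically notice the missing override.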

Next, the user will want to use descriptive strings for the attribute so error messages are informative, but shorter names in C++ code, so we declare the attribute names:

// Method 1: Sawyer::Attribute
const Sawyer::Attribute::Id APPROXIMATION_ATTR = Sawyer::Attribute::declare("type of approximation performed");
const Sawyer::Attribute::Id ANALYSIS_TIME_ATTR = Sawyer::Attribute::declare("time taken for the analysis");
// Method 2: AstAttributeMechanism
const std::string APPROXIMATION_ATTR = "type of approximation performed";
const std::string ANALYSIS_TIME_ATTR = "time taken for the analysis";
// Method 3: Attributes in IR nodes
const std::string APPROXIMATION_ATTR = "type of approximation performed";
const std::string ANALYSIS_TIME_ATTR = "time taken for the analysis";

The declarations in methods 2 and 3 are identical. Method 1 differs by using an integral type for attribute IDs, which has two benefits: (1) it prevents two users from using the same attribute name for different purposes, and (2) it reduces the size and increases the speed of the underlying storage maps by storing integer keys rather than strings. Method 1 has functions that convert between identification numbers and strings if necessary (e.g., error messages).
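The idea of declaring a name once and using an integer ID afterwards can be sketched independently of Sawyer. The NameRegistry class below and its member names are assumptions made for illustration only; they do not reproduce the Sawyer::Attribute implementation, and the use of std::runtime_error for a duplicate name is likewise just a choice for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical registry that hands out integer IDs for descriptive attribute names.
class NameRegistry {
    std::vector<std::string> names_;                    // the index is the ID
public:
    typedef std::size_t Id;

    // Registers a name and returns its new ID; a duplicate name is rejected.
    Id declare(const std::string &name) {
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (names_[i] == name)
                throw std::runtime_error("attribute already declared: " + name);
        }
        names_.push_back(name);
        return names_.size() - 1;
    }

    // Converts an ID back to its descriptive name, e.g. for error messages.
    const std::string& name(Id id) const {
        return names_.at(id);
    }
};

// Example use: distinct names get distinct IDs, and a second declaration fails.
inline void demonstrateRegistry() {
    NameRegistry registry;
    NameRegistry::Id approx = registry.declare("type of approximation performed");
    NameRegistry::Id time = registry.declare("time taken for the analysis");
    assert(approx != time);
    assert(registry.name(time) == "time taken for the analysis");

    bool rejected = false;
    try {
        registry.declare("type of approximation performed");
    } catch (const std::runtime_error&) {
        rejected = true;
    }
    assert(rejected);
}
```

With plain string keys, as in methods 2 and 3, nothing comparable stops two independent pieces of code from choosing the same name for unrelated data.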

Now, let us see how to insert two attributes into an object assuming that the object came from somewhere far away and we don't know whether it already contains these attributes. If it does, we want to overwrite their old values with new values. Overwriting values is likely to be a more common operation than insert-if-nonexistent. After all, languages generally don't have a dedicated assign-value-if-none-assigned operator (Perl and Bash being exceptions).

// Method 1: Sawyer::Attribute
obj_1.setAttribute(APPROXIMATION_ATTR, UNDER_APPROXIMATED);
obj_1.setAttribute(ANALYSIS_TIME_ATTR, AnalysisTime(1.0, 2.0));
// Method 2: AstAttributeMechanism
obj_1.attributeMechanism.set(APPROXIMATION_ATTR, new ApproximationAttribute_2(UNDER_APPROXIMATED));
obj_1.attributeMechanism.set(ANALYSIS_TIME_ATTR, new AnalysisTimeAttribute_2(AnalysisTime(1.0, 2.0)));
// Method 3: Attributes in IR nodes
obj_1->setAttribute(APPROXIMATION_ATTR, new ApproximationAttribute_3(UNDER_APPROXIMATED));
obj_1->setAttribute(ANALYSIS_TIME_ATTR, new AnalysisTimeAttribute_3(AnalysisTime(1.0, 2.0)));

Method 1 stores the attribute directly while Methods 2 and 3 require the attribute value to be wrapped in a heap-allocated object first.

Eventually the user will want to retrieve an attribute's value. Users commonly need to obtain the attribute or a default value.

// Method 1: Sawyer::Attribute
Approximation approx_1 = obj_1.attributeOrElse(APPROXIMATION_ATTR, UNKNOWN_APPROXIMATION);
double cpuTime_1 = obj_1.attributeOrDefault<AnalysisTime>(ANALYSIS_TIME_ATTR).cpuTime;

// Method 2: AstAttributeMechanism
Approximation approx_1 = UNKNOWN_APPROXIMATION;
if (ApproximationAttribute_2 *tmp = dynamic_cast<ApproximationAttribute_2*>(obj_1.attributeMechanism[APPROXIMATION_ATTR]))
    approx_1 = tmp->approximation;
double cpuTime_1 = AnalysisTime().cpuTime; // the default, assuming we don't want to hard-code it.
if (AnalysisTimeAttribute_2 *tmp = dynamic_cast<AnalysisTimeAttribute_2*>(obj_1.attributeMechanism[ANALYSIS_TIME_ATTR]))
    cpuTime_1 = tmp->analysisTime.cpuTime;

// Method 3: Attributes in IR nodes
Approximation approx_1 = UNKNOWN_APPROXIMATION;
if (ApproximationAttribute_3 *tmp = dynamic_cast<ApproximationAttribute_3*>(obj_1->getAttribute(APPROXIMATION_ATTR)))
    approx_1 = tmp->approximation;
double cpuTime_1 = AnalysisTime().cpuTime; // the default, assuming we don't want to hard-code it.
if (AnalysisTimeAttribute_3 *tmp = dynamic_cast<AnalysisTimeAttribute_3*>(obj_1->getAttribute(ANALYSIS_TIME_ATTR)))
    cpuTime_1 = tmp->analysisTime.cpuTime;

Method 1 has a couple of functions dedicated to this common scenario. Methods 2 and 3 return a null pointer if the attribute doesn't exist, but require a dynamic cast to the appropriate type otherwise.
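How a "value or default" query can hide the cast and still check types at run time is sketched below. This is not the Sawyer implementation: HolderBase, Holder, Store, getOrElse, and demonstrateLookup are hypothetical names, and std::runtime_error merely stands in for a dedicated exception type such as the WrongQueryType mentioned in the comparison above.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <typeinfo>

// Hypothetical type-erased holder that remembers the type it was written with.
struct HolderBase {
    virtual ~HolderBase() {}
    virtual const std::type_info& type() const = 0;
};

template<class T>
struct Holder: HolderBase {
    T value;
    explicit Holder(const T &v): value(v) {}
    const std::type_info& type() const override { return typeid(T); }
};

// Hypothetical attribute store keyed by integer ID.
class Store {
    typedef std::map<int, std::shared_ptr<HolderBase> > Map;
    Map values_;
public:
    // Stores a copy of value, replacing any previous value for this ID.
    template<class T>
    void set(int id, const T &value) {
        values_[id] = std::make_shared<Holder<T> >(value);
    }

    // Returns the stored value, or fallback if the ID is absent.
    // Throws std::runtime_error if the value was stored with a type other than T.
    template<class T>
    T getOrElse(int id, const T &fallback) const {
        Map::const_iterator found = values_.find(id);
        if (found == values_.end())
            return fallback;
        if (found->second->type() != typeid(T))
            throw std::runtime_error("attribute read with the wrong type");
        return static_cast<const Holder<T>&>(*found->second).value;
    }
};

// Example use: fallback when absent, stored value when present, error on mismatch.
inline void demonstrateLookup() {
    Store store;
    assert(store.getOrElse(1, 42) == 42);
    store.set(1, 7);
    assert(store.getOrElse(1, 42) == 7);

    bool threw = false;
    try {
        store.getOrElse(1, 3.14);       // stored as int, queried as double
    } catch (const std::runtime_error&) {
        threw = true;
    }
    assert(threw);
}
```

The caller never writes a cast; the type check happens inside the query, which is the property the comparison attributes to Method 1.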

Sooner or later a user will want to erase an attribute. Perhaps the attribute holds the result of some optional analysis which is no longer valid. The user wants to ensure that the attribute doesn't exist, but isn't sure whether it currently exists:

// Method 1: Sawyer::Attribute
obj_1.eraseAttribute(APPROXIMATION_ATTR);
obj_2.eraseAttribute(ANALYSIS_TIME_ATTR);
// Method 2: AstAttributeMechanism
obj_1.attributeMechanism.remove(APPROXIMATION_ATTR);
obj_2.attributeMechanism.remove(ANALYSIS_TIME_ATTR);
// Method 3: Attributes in IR nodes
obj_1->removeAttribute(APPROXIMATION_ATTR);
obj_2->removeAttribute(ANALYSIS_TIME_ATTR);

If the attribute didn't exist then none of these methods do anything. If it did exist, Method 1 calls the value's destructor, whereas Methods 2 and 3 delete the heap-allocated value, which is allowed since the attribute container owns the object.
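The ownership rule behind Methods 2 and 3 (the container owns the heap-allocated value, so it may delete it on removal or overwrite) can be sketched as follows. CountedAttribute, OwningContainer, and demonstrateErase are hypothetical names for this illustration; the sketch uses std::unique_ptr internally, which is an assumption of the example and not a statement about how AstAttributeMechanism is implemented.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical attribute type that counts how many instances are alive.
struct CountedAttribute {
    static int live;
    CountedAttribute() { ++live; }
    CountedAttribute(const CountedAttribute&) { ++live; }
    ~CountedAttribute() { --live; }
};
int CountedAttribute::live = 0;

// Hypothetical container that takes ownership of the pointers it is given.
class OwningContainer {
    std::map<std::string, std::unique_ptr<CountedAttribute> > attrs_;
public:
    // Stores value under name and deletes whatever value it replaces.
    void set(const std::string &name, CountedAttribute *value) {
        std::unique_ptr<CountedAttribute> &slot = attrs_[name];
        if (slot.get() != value)
            slot.reset(value);
    }

    // Deletes the named attribute; does nothing if the name is absent.
    void remove(const std::string &name) {
        attrs_.erase(name);
    }
};

// Example use: removal of a missing name is harmless, and no value is leaked.
inline void demonstrateErase() {
    OwningContainer obj;
    obj.remove("missing");                      // absent: no effect
    obj.set("a", new CountedAttribute);
    assert(CountedAttribute::live == 1);
    obj.set("a", new CountedAttribute);         // overwrite deletes the old value
    assert(CountedAttribute::live == 1);
    obj.remove("a");                            // removal deletes the value
    assert(CountedAttribute::live == 0);
}
```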

Finally, when the object containing the attributes is destroyed the user needs to be able to clean up by destroying the attributes that are attached:

// Method 1: Sawyer::Attribute: value destructors called automatically
// if containing object is destroyed by exception unwinding.
int x = something_that_might_throw();
// Method 2: AstAttributeMechanism: attributes automatically deleted
// if the containing object would be destroyed by exception unwinding.
int x = something_that_might_throw();
// Method 3: Attributes in IR nodes: attributes automatically destroyed if
// an IR node is deleted. IR nodes are seldom destroyed during exception unwinding.
int x = something_that_might_throw();

All three interfaces now properly clean up their attributes, although this wasn't always the case with methods 2 and 3.
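The cleanup guarantee for directly stored values follows from ordinary C++ destruction rules, which the sketch below demonstrates without ROSE. TrackedValue, ObjectWithValues, somethingThatThrows, and liveValuesAfterUnwinding are hypothetical names; somethingThatThrows plays the role of something_that_might_throw in the snippets above and, in this sketch, always throws.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>

// Hypothetical attribute value that counts how many instances are alive.
struct TrackedValue {
    static int live;
    TrackedValue() { ++live; }
    TrackedValue(const TrackedValue&) { ++live; }
    ~TrackedValue() { --live; }
};
int TrackedValue::live = 0;

// Hypothetical object that stores attribute values directly, keyed by integer ID.
struct ObjectWithValues {
    std::map<int, TrackedValue> attributes;
};

// Stand-in for an operation that fails; it always throws in this sketch.
inline int somethingThatThrows() {
    throw std::runtime_error("analysis failed");
}

// Creates an object holding one attribute, then leaves its scope by exception.
// Returns the number of values still alive after unwinding.
inline int liveValuesAfterUnwinding() {
    try {
        ObjectWithValues obj;
        obj.attributes[1];                      // default-constructs one value
        int x = somethingThatThrows();
        (void)x;                                // never reached
    } catch (const std::runtime_error&) {
        // obj and its attribute value were destroyed during unwinding.
    }
    return TrackedValue::live;
}

// Example use: no attribute value outlives its containing object.
inline void demonstrateUnwinding() {
    assert(liveValuesAfterUnwinding() == 0);
}
```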
