Filed under: Uncategorized

How do you operationalize a measurement anyways?

by X on Saturday, January 29, 2011 at 2:15pm

First of all, you have to define your output. It can be a scalar, a vector, or a tensor.

Of course, you can simply determine which criteria are relevant and output them as a vector. That preserves information, and is the most useful form of output for people who are attuned to comparing the specific facets of these criteria.

Or you can apply a metric (a distance metric such as the 2-norm – but it can also be a p-norm or a Hamming Distance or whatever) and output a score as a scalar value.
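To make the "apply a metric, get a scalar" step concrete, here is a minimal sketch of a p-norm distance and a Hamming distance. The function names and the example vectors are made up for illustration; nothing here comes from a particular ranking system.

```python
def p_norm_distance(a, b, p=2.0):
    """Minkowski (p-norm) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


# Hypothetical criteria vectors, compared against an all-zero reference.
measured = [3.0, 4.0, 0.0]
reference = [0.0, 0.0, 0.0]

euclidean = p_norm_distance(measured, reference, p=2.0)  # 2-norm
manhattan = p_norm_distance(measured, reference, p=1.0)  # 1-norm
mismatches = hamming_distance([1, 0, 1, 1], [1, 1, 0, 1])
```

The choice of p changes which differences dominate the score: larger p makes the single biggest gap matter more, while the Hamming distance only counts whether entries differ at all.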

And you can assign different weights to each criterion (there are many different ways to weight, not all as simple as the case where the weights all sum to 1). This, of course, presupposes that the criteria take the form of scalar quantities. Sometimes a relevant criterion is more complex than that.
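As a sketch of the simple case only (scalar criteria, non-negative weights rescaled to sum to 1), a weighted score might look like this. The name `weighted_score` and the numbers are illustrative assumptions.

```python
def weighted_score(values, weights):
    """Weighted average of scalar criteria; weights are rescaled to sum to 1."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    normalized = [w / total for w in weights]
    return sum(v * w for v, w in zip(values, normalized))


# Three hypothetical criteria; the first counts twice as much as the others.
score = weighted_score([0.9, 0.5, 0.2], [2, 1, 1])
```

Because the weights are rescaled inside the function, a user can state relative importance in any units and still get a score on the same scale as the criteria.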

And like what the NRC did with its grad program rankings, you can produce multiple outputs (the R and S criteria). Actually the NRC did something more complicated than that. For the R criteria, it asked professors which schools are most prestigious. Then it looked at the criteria characteristic of more prestigious schools, and ranked schools according to which ones had the “highest amount” of the criteria that were most prevalent in the more prestigious schools. Then for the S criteria, it simply asked professors which criteria were most important, and ranked schools according to how much they contained desirable criteria. And of course, users can rank schools in any order they want to simply by assigning weights to each value (although I think this is the simple case where all coefficients sum up to 1). In terms of validity, I’d trust the S metric over the R metric since it is less susceptible to cognitive biases. [1]

This, of course, is what’s already done and well-known. There are other ways of operationalizing output too. What’s often neglected are the effects of 2nd-order interactions. Maybe there’s a way to use 2nd-order interactions to operationalize output (maybe this is already being done).

Of course, even 2nd-order interactions are not enough. 2nd-order interactions tend to be commutative (in other words, order does not matter). However, there are situations where the order of the 2nd (and higher) order interactions does matter.
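One way to picture the difference, as a rough sketch: a commutative 2nd-order term multiplies pairs of criteria (x_i * x_j is the same as x_j * x_i), while an order-dependent term looks up an effect keyed by which item came first. Both functions, the coefficient table `q`, and the event names are hypothetical.

```python
def second_order_score(x, w, q):
    """Linear terms plus pairwise products; swapping i and j changes nothing."""
    n = len(x)
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = sum(q[i][j] * x[i] * x[j]
                   for i in range(n) for j in range(i + 1, n))
    return linear + pairwise


def ordered_pair_score(sequence, pair_effect):
    """Sum an effect for every (earlier, later) pair; order matters here."""
    total = 0.0
    for i, earlier in enumerate(sequence):
        for later in sequence[i + 1:]:
            total += pair_effect.get((earlier, later), 0.0)
    return total


# Hypothetical effects: the same two events score differently by order.
effects = {("stress", "sleep_loss"): 2.0, ("sleep_loss", "stress"): 0.5}
forward = ordered_pair_score(["stress", "sleep_loss"], effects)
backward = ordered_pair_score(["sleep_loss", "stress"], effects)
commutative = second_order_score([1.0, 2.0], [1.0, 1.0], [[0.0, 0.5], [0.0, 0.0]])
```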

And then, of course, you also have geometric relationships to consider. Geometric relationships may be as simple as taking the “distance” between two criteria (in other words, a scalar value that’s the output of some metric). Or they could be more complex.

And another relationship too: probabilities. Every measurement is uncertain and these uncertainties may also have to be included in our operationalization.

Also, weights are not simply scalars. Weights can also be functions. They can be functions of time, or functions of the particular person, or whatever. These multidimensional weights must still sum to 1, so when the weight of one goes up, the weight of the others goes down.
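A minimal sketch of weights as functions of time, assuming each criterion has a raw (unnormalized) weight function and the results are rescaled so they sum to 1 at every time t. The function name and the two example weight functions are invented for illustration.

```python
def normalized_weights(raw_weight_fns, t):
    """Evaluate each raw weight function at time t, then rescale to sum to 1."""
    raw = [fn(t) for fn in raw_weight_fns]
    total = sum(raw)
    if total <= 0:
        raise ValueError("raw weights must have a positive sum")
    return [r / total for r in raw]


# Hypothetical case: the first weight is constant, the second grows with time.
weight_fns = [lambda t: 1.0, lambda t: 1.0 + t]

at_start = normalized_weights(weight_fns, 0.0)  # equal weights
later = normalized_weights(weight_fns, 2.0)     # second criterion dominates
```

Even though the first raw weight never changes, its normalized share drops as the second one grows, which is the trade-off described above.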

Also, even scalar outputs are not necessarily scalars. Rather, they can be expressed as probability distributions (again, I was quite impressed with the uncertainty range/confidence interval outputs of the NRC rankings).
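As a rough illustration of a score reported as a range rather than a single number, the sketch below perturbs the weights at random and reports the 5th and 95th percentiles of the resulting scores. This is not the NRC's actual procedure; the jitter model, the function name, and all numbers are assumptions.

```python
import random


def score_interval(values, base_weights, n_samples=1000, jitter=0.2, seed=0):
    """Return (5th percentile, 95th percentile) of scores under jittered weights."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    scores = []
    for _ in range(n_samples):
        # Scale each weight by a random factor in [1 - jitter, 1 + jitter].
        raw = [max(w * (1.0 + rng.uniform(-jitter, jitter)), 0.0)
               for w in base_weights]
        total = sum(raw)
        if total <= 0:
            continue  # skip degenerate draws where every weight vanished
        scores.append(sum(v * r / total for v, r in zip(values, raw)))
    if not scores:
        raise ValueError("no valid samples; check base_weights and jitter")
    scores.sort()
    low = scores[int(0.05 * (len(scores) - 1))]
    high = scores[int(0.95 * (len(scores) - 1))]
    return low, high


low, high = score_interval([0.9, 0.5, 0.2], [0.5, 0.25, 0.25])
```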

Maybe studying ways to *operationalize* things is already a domain of intense interdisciplinary interest.

==

Anyways, some examples:

The DSM-IV is fundamentally flawed. Of course, every operationalization is flawed – some are still good enough to do what they do despite being flawed. And why is the DSM-IV flawed? Because in order to get diagnosed with some condition, you need only fulfill at least X of Y different criteria. There’s absolutely nothing about the magnitude of each symptom, or the environmental-dependence of each symptom, or the interactions the symptoms can have with each other.
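A toy sketch of what an "at least X of Y" rule throws away. The magnitudes, the cutoff, and the function name are hypothetical and do not correspond to any real diagnostic criteria.

```python
def meets_count_threshold(magnitudes, required, cutoff=0.0):
    """True if at least `required` symptoms exceed `cutoff`; magnitude is ignored."""
    present = [m > cutoff for m in magnitudes]
    return sum(present) >= required


# Two hypothetical people with the same three symptoms present out of five.
mild = [0.1, 0.1, 0.1, 0.0, 0.0]
severe = [0.9, 0.9, 0.9, 0.0, 0.0]

mild_diagnosed = meets_count_threshold(mild, required=3)
severe_diagnosed = meets_count_threshold(severe, required=3)
mild_total, severe_total = sum(mild), sum(severe)
```

Both cases pass the 3-of-5 rule, yet their total magnitudes differ by a factor of nine. The count-based rule cannot tell them apart, and it says nothing about interactions or environment either.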

Furthermore, each operationalization (or definition, really) should take in parameters to specify potential environmental-dependence (the operationalization may have VERY LITTLE change when you vary the parameters, OR it could change SIGNIFICANTLY when you vary the parameters). I believe that people are currently systematically underestimating the environmental-dependence of many of their definitions/operationalizations. You can also call these parameters “relevance parameters”, since they depend on the person who’s processing them.

[1] This dichotomy is actually VERY useful for other fields too. For example, rating which forum members are the funniest in a forum.

==

Skepticism regarding coarse mechanisms

by X on Saturday, January 29, 2011 at 1:46pm

I like trying to elucidate neurobiological/cognitive/behavioral mechanisms at a finer structure. Coarse mechanisms are often less robust (we don’t know how they really interact, and in a DIFFERENT environment the preconditions of these mechanisms may uncouple – we may not even know the preconditions of these mechanisms). The reason is this: necessary conditions are inclusive of BOTH preconditions and the GEOMETRY+combinatorial arrangements of these preconditions – where these preconditions are with respect to each other. This is a good reason to be skeptical of the utility of measuring things that we only measure coarsely, such as IQ, since we still don’t know the GEOMETRY+combinatorial arrangements of the preconditions of high IQ (that being said, I still do believe most of the correlations in this particular environment).

In other words, having all the preconditions is not sufficient to explain the mechanism. You have to have the preconditions *arranged* in the *right* way – both geometrically and combinatorially (inclusive of order effects, where the preconditions have to be temporally/spatially arranged in the right way – not just with each other – but also with the rest of the environment).

==

If the mechanism preserves itself even *after* we have updated information of its finer structure, then yes, we can update our posterior probability of the mechanism applying.

==

Now, how is research in the finer mechanisms done? Is it more likely than others to be Kuhnian normal science or Kuhnian revolutionary science? Studying combinatorial/geometric interactions can be *very* analytically (and computationally) intensive.

What’s also interesting: finer mechanisms tend to be more general, even though finer = smaller scale. But it often takes large numbers of measurements before someone has enough data to elucidate the finer mechanism.

Some of my best threads on Physics Forums

wikipedia: http://en.wikipedia.org/wiki/User:Simfish

books I’ve read: http://books.google.com/books?uid=13718804063927505694

journal articles i’ve read: http://www.citeulike.org/profile/InquilineKea

google profile: http://www.google.com/profiles/simfish

Google reader shared items: http://www.google.com/reader/shared/10516082170111880850

Science news: (URLs don’t really matter anymore, just google the titles)

Favorite blogs @ http://inquilinekea.blogspot.com

Also posted here: http://lesswrong.com/r/discussion/lw/3ol/is_there_a_way_to_quantify_the_relationship/

Okay, so maybe you could say this.

Suppose you have an index I. The index I could be a list of items in belief-space (or a person’s map). So it could contain items such as (believes in evolution, believes in free will, believes that he will get energy from eating food, etc.). Of course, in order to make this argument more rigorous, we must make the beliefs finer.

For now, we can assume the non-existence of a priori knowledge: in other words, of facts that people may not explicitly know, but would deduce simply by using the knowledge they already have.

Now, maybe Person1 has a map in j-space with values of (0,0,0.2,0.5,0,1,…), corresponding to the degree of his belief in items in index I. So the first value of 0 corresponds to his total disbelief in evolution, the second corresponds to total disbelief in free will, and so on.

Person2 has a map in k-space with values of (0,0,0.2,0.5,0,0.8, NaN, 0, 1, …), corresponding to the degree of his belief in everything in the world. Now, I include a value of NaN in his map, because the NaN could correspond to an item in index I that he has never encountered. Maybe there’s a way to quantify NaN, which might make it possible for Person1 and Person2 to both have maps in the same n-space (which might make it more possible to compare their mutual information using traditional math methods).

Furthermore, Person1’s map is a function of time, as is Person2’s map. Their maps evolve over time since they learn new information, change their beliefs, and forget information. Person1’s map can expand from j-space to (j+n)-space, as he forms new beliefs on new items. Once you apply a distance metric to their beliefs, you might be able to map them on a grid, to compare their beliefs with each other. A distance metric with a scalar value, for example, would map their beliefs to a 1D axis (this is what political tests often do). A more general comparison can also output a vector value (much like what an MBTI personality test could do), that is, a value in j-space. If you simply took the difference between the two maps, you could also output a vector value that could be mapped to a space whose dimension is equal to the dimension of the original map (assuming that the two maps have the same dimension, of course).
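Here is one possible sketch of comparing two such maps, under a strong assumption: both maps list the items of index I in the same order, so coordinate i means the same belief for both people. The comparison uses only coordinates that both maps have and that are not NaN. The function name is hypothetical, and a "distance" computed on a varying set of shared coordinates may not satisfy the triangle inequality, so it is not guaranteed to be a true metric.

```python
import math


def shared_distance(map_a, map_b):
    """Euclidean distance over coordinates present (non-NaN) in both maps.

    Returns (distance, number_of_coordinates_compared).
    """
    diffs = []
    for a, b in zip(map_a, map_b):  # zip stops at the shorter map
        if math.isnan(a) or math.isnan(b):
            continue  # skip items one person has never encountered
        diffs.append(a - b)
    if not diffs:
        return float("nan"), 0
    return math.sqrt(sum(d * d for d in diffs)), len(diffs)


nan = float("nan")
person1 = [0, 0, 0.2, 0.5, 0, 1]                # map in j-space (j = 6)
person2 = [0, 0, 0.2, 0.5, 0, 0.8, nan, 0, 1]   # map in k-space (k = 9)

distance, compared = shared_distance(person1, person2)
```

Reporting how many coordinates were compared alongside the distance matters, since two people who share only one item can look deceptively close.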

Anyways, here is my question: Is there a better way to quantify this? Has anyone else thought of this? Of course, we could use a distance metric to compare their maps with respect to each other (a Euclidean metric could be used if they have maps in the same n-space).

==

As an alternative question, are there metrics that could compare the distance between a map in j-space with a map in k-space (even if j is not equal to k)? I know that you have p-norms that correspond to some absolute scalar value when you apply the p-norms to a matrix. But this is sort of different. And could mutual information be considered a metric?

# The "map" and "territory" analogy as it pertains to potentially novel territories that people may not anticipate

InquilineKea, 07 January 2011 09:19AM

So in terms of the "map" and "territory" analogy, the goal of rationality is to make our map correspond more closely with the territory. This comes in two forms – (a) area and (b) accuracy. Person A could have a larger map than person B, even if A’s map might be less accurate than B’s map. There are ways to increase the area of the map – often by testing things in the boundary value conditions of the territory. I often like asking boundary value/possibility space questions like "well, what might happen to the atmosphere of a rogue planet as time approaches infinity?", since I feel like they might give us additional insight about the robustness of planetary atmosphere models across different environments (and also, the possibility that I might be wrong makes me more motivated to actually spend additional effort to test/calibrate my model more than I otherwise would test/calibrate it). My intense curiosity about these highly theoretical questions often puzzles the experts in the field though, since they feel like these questions aren’t empirically verifiable (so they are considered less "interesting"). I also like to study other things that many academics aren’t necessarily comfortable with studying (perhaps since it is harder to be empirically rigorous), such as the possible social outcomes that could spring out of a radical social experiment. When you’re concerned with maintaining the accuracy of your map, it may come at the sacrifice of dA/dt, where A is area (so your area increases more slowly with time).

I also feel that social breaching experiments are another interesting way of increasing the volume of my "map", since they help me test the robustness of my social models in situations that people are unaccustomed to. Hackers often perform these sorts of experiments to test the robustness of security systems (in fact, a low level of potentially embarrassing hacking is probably optimal when it comes to ensuring that the security system remains robust – although it’s entirely possible that even then, people may pay too much attention to certain models of hacking, causing potentially malicious hackers to dream up new models of hacking).

With possibility space, you could code up the conditions of the environment as a vector in a k-dimensional space such as (1,0,0,1,0,…), where 1 indicates the existence of some variable in a particular environment, and 0 indicates the absence of that variable. We can then use Huffman coding to reflect the frequency of each combination of conditions in the set of environments we most frequently encounter (so then, less probable environments would have longer Huffman codes, or higher information content).
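A small sketch of that idea, computing only the Huffman code length for each environment vector from how often it is encountered. The environment tuples and their frequencies are invented for illustration.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Map each symbol to its Huffman code length, given symbol frequencies."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    tiebreak = count()  # unique counter so the heap never compares symbol lists
    heap = [(f, next(tiebreak), [sym]) for sym, f in freqs.items()]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in freqs}
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node gets one more bit.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (f1 + f2, next(tiebreak), syms1 + syms2))
    return lengths


# Hypothetical environments (presence/absence of three variables) and
# how often each combination is encountered.
environment_freqs = {
    (1, 0, 0): 50,
    (1, 1, 0): 30,
    (0, 1, 1): 15,
    (0, 0, 1): 5,
}
lengths = huffman_code_lengths(environment_freqs)
```

In this example the most common environment gets a 1-bit code and the two rarest get 3-bit codes, matching the intuition that rarely encountered regions of possibility space are "longer" to describe.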

As we know from Taleb’s book "The Black Swan", many people frequently underestimate the prevalence of "long tail" events (which are often part of the unrealized portion of possibility space, and have longer Huffman codes). This causes them to over-rely on Gaussian distributions even in situations where the Gaussian distributions may be inappropriate, and it is often said that this was one of the factors behind the recent financial crisis.

Now, what does this investigation of possibility space allow us to do? It allows us to re-examine the robustness of our formal system – how sensitive or flexible our system is with respect to continuing its duties in the face of perturbations in the environment we believe it’s applicable for. We often have a tendency to overestimate the consistency of the environment. But if we consistently try to test the boundary conditions, we might be able to better estimate the "map" that corresponds to the "territory" of different (or potentially novel) environments that exist in possibility space, but not yet in realized possibility space.

The thing is, though, that many people have a habitual tendency to avoid exploring boundary conditions. The fact is that the space of realized events is always far smaller than the entirety of possibility space, and it is usually impractical to explore all of possibility space. Since our time is limited, and the payoffs of exploring the unrealized portions of possibility space uncertain (and often time-delayed, and also subject to hyperbolic time-discounting, especially when the payoffs may come only after a single person’s lifetime), people often don’t explore these portions of possibility space (although life extension, combined with various creative approaches to decrease people’s time preference, might change the incentives). Furthermore, we cannot empirically verify unrealized portions of possibility space using the traditional scientific method. Bayesian methods may be more appropriate, but even then, people may be susceptible to plugging the wrong values into the Bayesian formula (again, perhaps due to over-assuming continuity in environmental conditions). As in my original example about hacking, it is way too easy for the designers of security systems to use the wrong Bayesian priors when they are being observed by potential hackers, who may have an idea about ways to take advantage of the values of these Bayesian priors.
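To show how much a wrong prior can matter, here is a minimal sketch of Bayes’ rule for a single yes/no hypothesis. The function name and all of the probabilities are made-up illustrations, not estimates of anything real.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """P(hypothesis | evidence) for a binary hypothesis, via Bayes' rule."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    if denominator == 0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return numerator / denominator


# Same evidence, two different priors for "this system is being attacked".
complacent = posterior(prior=0.01, p_evidence_if_true=0.9, p_evidence_if_false=0.1)
wary = posterior(prior=0.20, p_evidence_if_true=0.9, p_evidence_if_false=0.1)
```

With identical evidence, the complacent prior leaves the posterior below 10% while the wary prior pushes it to roughly 69%, so an observer who knows which prior a designer uses knows how much evidence they can leave behind before anyone reacts.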